Nutch2Crawling
Reference: https://wiki.apache.org/nutch/Nutch2Crawling

InjectorJob

The injector takes a list of seed URLs and writes them into the web table as WebPage rows. In InjectorJob.run(), the job's output is wired directly to the Gora-backed store:
currentJob.setOutputFormatClass(GoraOutputFormat.class);
// create the DataStore<String, WebPage> backing the web table
DataStore<String, WebPage> store = StorageUtils.createWebStore(currentJob.getConfiguration(),
    String.class, WebPage.class);
GoraOutputFormat.setOutput(currentJob, store, true);
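Rows in the web table are keyed by the reversed URL (e.g. com.example:http/). As a minimal standalone sketch of what ends up in that store, the following writes a single WebPage row through the same DataStore API; it assumes a Nutch 2.2+ build where WebPage is an Avro builder class (older 2.x builds use new WebPage() instead), and it is only an illustration, not code from InjectorJob.

import org.apache.gora.store.DataStore;
import org.apache.hadoop.conf.Configuration;
import org.apache.nutch.storage.StorageUtils;
import org.apache.nutch.storage.WebPage;
import org.apache.nutch.util.NutchConfiguration;
import org.apache.nutch.util.TableUtil;

public class WebStoreSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = NutchConfiguration.create();
    DataStore<String, WebPage> store =
        StorageUtils.createWebStore(conf, String.class, WebPage.class);

    // rows are keyed by the reversed URL, e.g. "com.example:http/"
    String key = TableUtil.reverseUrl("http://example.com/");
    WebPage page = WebPage.newBuilder().build();   // pre-2.2 builds: new WebPage()

    store.put(key, page);
    store.flush();
    store.close();
  }
}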
GeneratorJob

The generator selects URLs that are due for fetching, marks them with a batch ID, and partitions them across reducers. The partition mode (by host, by domain, or by IP) is configured through URLPartitioner:

getConf().set(URLPartitioner.PARTITION_MODE_KEY, URLPartitioner.PARTITION_MODE_HOST);

The partition itself is computed in org.apache.nutch.crawl.URLPartitioner.getPartition(String, int):
if (mode.equals(PARTITION_MODE_HOST)) {
  hashCode = url.getHost().hashCode();
} else if (mode.equals(PARTITION_MODE_DOMAIN)) {
  hashCode = URLUtil.getDomainName(url).hashCode();
} else { // MODE IP
  InetAddress address = InetAddress.getByName(url.getHost());
  hashCode = address.getHostAddress().hashCode();
}
// make hosts wind up in different partitions on different runs
hashCode ^= seed;
return (hashCode & Integer.MAX_VALUE) % numReduceTasks;
The partitioning behavior is covered by the unit tests in org.apache.nutch.crawl.TestURLPartitioner.
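To see the host mode concretely, here is a small standalone illustration (not Nutch code) of the same hash-xor-modulo computation; URLs sharing a host always land in the same reducer partition for a given seed:

import java.net.URL;

public class PartitionDemo {
  static int partitionByHost(String url, int seed, int numReduceTasks) throws Exception {
    int hashCode = new URL(url).getHost().hashCode();
    hashCode ^= seed;                                  // vary the assignment between runs
    return (hashCode & Integer.MAX_VALUE) % numReduceTasks;
  }

  public static void main(String[] args) throws Exception {
    int seed = 42, reducers = 4;
    // same host => same partition; a different host may land elsewhere
    System.out.println(partitionByHost("http://example.com/a", seed, reducers));
    System.out.println(partitionByHost("http://example.com/b", seed, reducers));
    System.out.println(partitionByHost("http://example.org/", seed, reducers));
  }
}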
FetcherJob
Fetches all URLs that the generator marked with a given batch ID (or, optionally, all URLs).

How does FetcherReducer work?
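In outline (a hedged summary, not lifted verbatim from the source): the reducer groups its input into per-host (or per-domain/IP, depending on fetcher.queue.mode) fetch queues and starts a pool of fetcher threads sized by fetcher.threads.fetch; each thread repeatedly pulls an item from a queue, fetches it, and waits out the crawl delay so a single host is not hit too quickly. A much-simplified sketch of that producer-consumer pattern (plain Java, no Nutch classes):

import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class FetcherSketch {
  // one FIFO queue per host, so politeness can be enforced per host
  static final Map<String, Queue<String>> queuesByHost = new ConcurrentHashMap<>();

  static void enqueue(String host, String url) {
    queuesByHost.computeIfAbsent(host, h -> new ConcurrentLinkedQueue<>()).add(url);
  }

  public static void main(String[] args) throws Exception {
    enqueue("example.com", "http://example.com/a");
    enqueue("example.com", "http://example.com/b");
    enqueue("example.org", "http://example.org/");

    ExecutorService pool = Executors.newFixedThreadPool(2); // cf. fetcher.threads.fetch
    for (Queue<String> queue : queuesByHost.values()) {
      pool.submit(() -> {
        String url;
        while ((url = queue.poll()) != null) {
          System.out.println("fetching " + url);   // the real reducer issues the HTTP request here
          try {
            Thread.sleep(1000);                    // crawl delay between requests to the same host
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.MINUTES);
  }
}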
ParserJob

Parses all web pages from a given batch ID. The core of ParserMapper.map() is a single call:

parseUtil.process(key, page);
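A condensed sketch of a ParserJob-style mapper is below; the names follow the Nutch 2.x sources (GoraMapper, ParseUtil), but details such as the batch-ID / fetch-mark check, signature handling, and status updates are omitted, so treat it as an illustration rather than the verbatim class.

import java.io.IOException;

import org.apache.gora.mapreduce.GoraMapper;
import org.apache.hadoop.conf.Configuration;
import org.apache.nutch.parse.ParseUtil;
import org.apache.nutch.storage.WebPage;

public class ParserMapperSketch extends GoraMapper<String, WebPage, String, WebPage> {
  private ParseUtil parseUtil;

  @Override
  protected void setup(Context context) throws IOException {
    Configuration conf = context.getConfiguration();
    parseUtil = new ParseUtil(conf);   // loads the configured parser plugins
  }

  @Override
  protected void map(String key, WebPage page, Context context)
      throws IOException, InterruptedException {
    // the real mapper first checks that the page belongs to the requested batch ID
    parseUtil.process(key, page);      // runs the parsers, mutating the WebPage in place
    context.write(key, page);          // the parsed page is written back to the web store
  }
}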