java.lang.Object
  org.apache.nutch.protocol.ftp.Ftp
public class Ftp
Ftp.java handles the ftp: URL scheme. Its configurable parameters are defined in the "FTP properties" section of ./conf/nutch-default.xml or a similar configuration file.
| Field Summary | |
|---|---|
| static org.apache.commons.logging.Log | LOG |
| Fields inherited from interface org.apache.nutch.protocol.Protocol |
|---|
| CHECK_BLOCKING, CHECK_ROBOTS, X_POINT_ID |
| Constructor Summary | |
|---|---|
| Ftp() | |
| Method Summary | | |
|---|---|---|
| protected void | finalize() | |
| org.apache.hadoop.conf.Configuration | getConf() | |
| ProtocolOutput | getProtocolOutput(org.apache.hadoop.io.Text url, CrawlDatum datum) | Returns the Content for a fetchlist entry. |
| RobotRules | getRobotRules(org.apache.hadoop.io.Text url, CrawlDatum datum) | Retrieve robot rules applicable for this url. |
| static void | main(String[] args) | For debugging. |
| void | setConf(org.apache.hadoop.conf.Configuration conf) | |
| void | setFollowTalk(boolean followTalk) | Set followTalk. |
| void | setKeepConnection(boolean keepConnection) | Set keepConnection. |
| void | setMaxContentLength(int length) | Set the point at which content is truncated. |
| void | setTimeout(int to) | Set the timeout. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final org.apache.commons.logging.Log LOG
| Constructor Detail |
|---|
public Ftp()
| Method Detail |
|---|
public void setTimeout(int to)
public void setMaxContentLength(int length)
public void setFollowTalk(boolean followTalk)
public void setKeepConnection(boolean keepConnection)
public ProtocolOutput getProtocolOutput(org.apache.hadoop.io.Text url,
CrawlDatum datum)
Description copied from interface: Protocol
Returns the Content for a fetchlist entry.
Specified by: getProtocolOutput in interface Protocol

protected void finalize()
Overrides: finalize in class Object

public static void main(String[] args)
throws Exception
For debugging.
Throws: Exception

public void setConf(org.apache.hadoop.conf.Configuration conf)
Specified by: setConf in interface org.apache.hadoop.conf.Configurable

public org.apache.hadoop.conf.Configuration getConf()
Specified by: getConf in interface org.apache.hadoop.conf.Configurable

public RobotRules getRobotRules(org.apache.hadoop.io.Text url,
CrawlDatum datum)
Description copied from interface: Protocol
Retrieve robot rules applicable for this url.
Specified by: getRobotRules in interface Protocol
Parameters: url - url to check; datum - page datum
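
setMaxContentLength(int length) sets the point at which fetched content is truncated. The following is a minimal, self-contained sketch of what truncating at such a limit could look like. It is not the Ftp class's actual implementation: the class TruncationSketch and the method truncate are hypothetical names introduced for illustration, and treating a negative limit as "no truncation" is an assumption made here, not something this page states.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, NOT part of Nutch: illustrates truncating content at a limit.
final class TruncationSketch {

    // Returns at most maxContentLength bytes of the given content.
    // Assumption: a negative limit means "do not truncate".
    static byte[] truncate(byte[] content, int maxContentLength) {
        if (maxContentLength < 0 || content.length <= maxContentLength) {
            return content; // nothing to cut off
        }
        // Copy only the first maxContentLength bytes.
        return Arrays.copyOf(content, maxContentLength);
    }

    public static void main(String[] args) {
        byte[] listing = "drwxr-xr-x pub\n-rw-r--r-- README\n"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] kept = truncate(listing, 14);
        System.out.println("original bytes: " + listing.length);
        System.out.println("kept bytes:     " + kept.length);
    }
}
```

A caller would typically compare the length of the returned array with the original length to detect that content was cut off.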