scrapy - Scrapyd with Polipo and Tor


Update: I am now running this command:


scrapyd-deploy <project_name>

It produces this error:


504 Connect to localhost:8123 failed: General SOCKS server failure


I am trying to deploy my spider with scrapyd-deploy. This is the command I used:


scrapyd-deploy -L <project_name>

I get the following error message:


Traceback (most recent call last):
  File "/usr/local/bin/scrapyd-deploy", line 269, in <module>
    main()
  File "/usr/local/bin/scrapyd-deploy", line 74, in main
    f = urllib2.urlopen(req)
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 410, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not found

Here is my scrapy.cfg file:


[settings]
default = <project_name>.settings

[deploy:<project_name>]
url = http://localhost:8123
project = <project_name>
eggs_dir = eggs
logs_dir = logs
items_dir = items
jobs_to_keep = 5
dbs_dir = dbs
max_proc = 0
max_proc_per_cpu = 4
finished_to_keep = 100
poll_interval = 5
http_port = 8123
debug = on
runner = scrapyd.runner
application = scrapyd.app.application
launcher = scrapyd.launcher.Launcher

[services]
schedule.json = scrapyd.webservice.Schedule
cancel.json = scrapyd.webservice.Cancel
addversion.json = scrapyd.webservice.AddVersion
listprojects.json = scrapyd.webservice.ListProjects
listversions.json = scrapyd.webservice.ListVersions
listspiders.json = scrapyd.webservice.ListSpiders
delproject.json = scrapyd.webservice.DeleteProject
delversion.json = scrapyd.webservice.DeleteVersion
listjobs.json = scrapyd.webservice.ListJobs

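I am running Tor and Polipo, with the Polipo proxy listening on http://localhost:8123. I can run wget and download files without any problems. The proxy works fine, and I can connect to the internet and so on. Please ask if you need more details.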

Thanks!


urllib2.HTTPError: HTTP Error 404: Not found

The URL was not reached.
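One way to check what is actually answering at a given address is to request scrapyd's listprojects.json endpoint (it appears in the [services] list in the question) directly, with any proxy settings ignored. The following is only a minimal sketch in Python 3: the helper name list_projects is hypothetical, and the error dictionaries it returns are an illustration, not scrapyd's own format.

```python
import json
import urllib.error
import urllib.request


def list_projects(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch <base_url>/listprojects.json, ignoring any configured proxy."""
    # An empty ProxyHandler disables proxies taken from the environment,
    # so the request goes straight to base_url.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    url = base_url.rstrip("/") + "/listprojects.json"
    try:
        with opener.open(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        # Something answered, but not the way scrapyd would (e.g. 404 or 504).
        return {"status": "error", "message": f"HTTP {exc.code} {exc.reason}"}
    except urllib.error.URLError as exc:
        # Nothing answered at all, e.g. connection refused.
        return {"status": "error", "message": str(exc.reason)}
    except ValueError as exc:
        # The body was not JSON, so this is probably not scrapyd.
        return {"status": "error", "message": f"unexpected response: {exc}"}
```

Calling list_projects("http://localhost:6800") assumes scrapyd is running on its default port 6800; a running scrapyd should answer with a JSON object whose status is "ok", whereas a proxy such as Polipo on port 8123 would be expected to produce an HTTP error instead.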


Is there anything interesting in /var/log/polipo/polipo.log? Try: tail -n 100 /var/log/polipo/polipo.log


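Apparently this happened because I forgot to run the main command. It is an easy mistake to make, because the command is mentioned on the Overview page of the documentation rather than on the Deploy page. Here is the command: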

 
scrapyd
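Before deploying, it can help to confirm that the scrapyd server is actually up. The sketch below simply tests whether anything accepts TCP connections on a port; the function name is_listening is hypothetical, and port 6800 is scrapyd's default http_port, which may differ in your configuration.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 6800 is scrapyd's default http_port; adjust if yours differs.
    print("something is listening on 6800:", is_listening("localhost", 6800))
```

A True result only shows that some process is listening there; it does not prove the process is scrapyd rather than, say, a proxy.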

 

504 Connect to localhost:8123 failed: General SOCKS server failure

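You are connecting to Polipo on localhost:8123; Polipo passes the request on to Tor, which returns a failure, and Polipo relays that failure back to you ("General SOCKS server failure").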


url = http://localhost:8123

This is certainly not what you want.

 
http_port = 8123

 

Also, I am quite sure you do not want to run scrapyd on Polipo, that is, on the same port 8123 that Polipo already uses.
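The clash described above can be caught mechanically. As a rough sketch, the Python 3 snippet below parses scrapy.cfg text and reports deploy targets whose url points at a given port, such as Polipo's 8123 from the question. The function name deploy_targets_on_port and the project name myproject are hypothetical placeholders.

```python
import configparser
from urllib.parse import urlsplit


def deploy_targets_on_port(cfg_text: str, port: int) -> list:
    """Return the [deploy] / [deploy:<name>] sections whose url uses `port`."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    clashes = []
    for section in parser.sections():
        if section != "deploy" and not section.startswith("deploy:"):
            continue
        url = parser.get(section, "url", fallback="")
        if urlsplit(url).port == port:
            clashes.append(section)
    return clashes


# "myproject" is a placeholder project name used only for this example.
SAMPLE = """\
[settings]
default = myproject.settings

[deploy:myproject]
url = http://localhost:8123
project = myproject
"""

print(deploy_targets_on_port(SAMPLE, 8123))  # ['deploy:myproject']
print(deploy_targets_on_port(SAMPLE, 6800))  # []
```

If the first call reports a section, that deploy target is aimed at the proxy rather than at scrapyd; pointing url at the port scrapyd really listens on (6800 by default) would be the likely fix.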

...