
# Python Tutorial: Web Security

0x00 Overview
-------

* * *

Starting from example code, this article explains the role Python plays in web security analysis, using the most basic examples to show how Python can fetch, parse, and process various kinds of web pages.

System environment: kali + beautifulsoup + mechanize. Because no low-level driver work is involved, the sample code can be used on any platform, provided the required packages are installed there.

0x01 Fetching a web page with Python
--------------------

* * *

```
Python 2.7.6 (default, Nov 10 2013, 19:24:24) [MSC v.1500 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import urllib
```

First, import urllib for the analysis that follows.

```
>>> httpResponse = urllib.urlopen("http://www.baidu.com")
```

Fetch an HTTP response, using Baidu as the example.

```
>>> httpResponse.code
200
```

The status is 200 OK.

```
>>> print httpResponse.read()[0:500]
```

To save space, only the first 500 bytes of the body are printed.

```
>>> for header,value in httpResponse.headers.items() : print header+':'+value

bdqid:0xeb89374a00028e2e
x-powered-by:HPHP
set-cookie:BAIDUID=0C926CCF670378EAAA0BD29C611B3AE8:FG=1; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com, BDSVRTM=0; path=/, H_PS_PSSID=5615_4392_1423_7650_7571_6996_7445_7539_6505_6018_7254_7607_7134_7666_7415_7572_7580_7475; path=/; domain=.baidu.com
expires:Tue, 15 Jul 2014 02:37:00 GMT
vary:Accept-Encoding
bduserid:0
server:BWS/1.1
connection:Close
cxy_all:baidu+776b3a548a71afebd09c6640f9af5559
cache-control:private
date:Tue, 15 Jul 2014 02:37:47 GMT
p3p:CP=" OTI DSP COR IVA OUR IND COM "
content-type:text/html; charset=utf-8
bdpagetype:1
>>> url = "http://www.baidu.com/s?wd=df&rsv_spt=1"
```

The loop prints every response header as a `name:value` pair. The last line stores the complete URL, query string included, that can be used to fetch a search results page.

```
>>> base_url = "http://www.baidu.com"
```

The base URL.

```
>>> args = {'wd':'df','rsv_spt':1}
```

The query parameters are built separately as a dictionary.

```
>>> encode_args = urllib.urlencode(args)
```

`urllib.urlencode` encodes the dictionary as a URL query string.

```
>>> fp2=urllib.urlopen(base_url+'/s?'+encode_args)
```

Fetch the web page again, this time using the URL assembled from the base URL and the encoded parameters.

```
>>> print fp2.read()[0:500].decode("utf-8")
```

Because the page is UTF-8 encoded, decode it explicitly so that the Chinese text displays correctly.

```
df_百度搜索