飞道's Blog

Sending HTTP requests with custom request headers using PHP curl


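Overview:

Adding browser-style request headers to a curl request makes it look more like a visit from a real user, so it is less likely to be flagged as a crawler (scraper). This can help the request get past security checks such as Incapsula.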

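1. First, open the API URL in a browser (Firefox in this example), press F12 to open the developer tools, and look at the request headers the browser sends (Network tab).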
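Once the headers are visible in the developer tools, they can be copied as raw text and turned into the list format that CURLOPT_HTTPHEADER expects. The following is only a minimal sketch: parse_raw_headers() and its $skip parameter are hypothetical names that do not appear in the original post, and the sample header text is made up for illustration. It drops the request line and, by default, the headers that curl normally fills in by itself.

```php
<?php
/**
 * Convert "Name: value" lines copied from the browser's developer tools
 * into the list format used by CURLOPT_HTTPHEADER.
 * parse_raw_headers() is a hypothetical helper, not part of the original post.
 */
function parse_raw_headers($raw, array $skip = array('host', 'content-length', 'accept-encoding'))
{
    $headers = array();
    foreach (preg_split('/\r\n|\r|\n/', $raw) as $line) {
        $line = trim($line);
        // Keep only "Name: value" lines; the request line
        // ("GET /path HTTP/1.1") and blank lines do not match.
        if (!preg_match('/^([A-Za-z0-9-]+)\s*:\s*(.*)$/', $line, $m)) {
            continue;
        }
        // Leave out headers that curl derives from the URL or its own options.
        if (in_array(strtolower($m[1]), $skip, true)) {
            continue;
        }
        $headers[] = $m[1] . ': ' . $m[2];
    }
    return $headers;
}

// Sample text as it might be copied from the browser (illustrative only).
$raw = <<<'EOT'
GET /getdata.php?v=0.5 HTTP/1.1
Host: www.xxx.com
Accept: application/json, text/javascript, */*; q=0.01
Accept-Language:zh-CN,zh;q=0.8
X-Requested-With: XMLHttpRequest
EOT;

print_r(parse_raw_headers($raw));
```

The resulting array can be passed directly to curl_setopt($ch, CURLOPT_HTTPHEADER, ...).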

2. Implementation:


  
    <?php
    /**
     * Send the request and return the response body.
     *
     * @param string $url
     * @return bool|string response body, or false on failure
     */
    function fetch_url($url) {
        $header = FormatHeader($url);
        $useragent = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:83.0) Gecko/20100101 Firefox/83.0';
        $timeout = 120;
        $ch = curl_init($url);
        // Treat HTTP status codes >= 400 as failures
        curl_setopt($ch, CURLOPT_FAILONERROR, true);
        // Set the request headers
        curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
        // Do not include the response headers in the returned content
        curl_setopt($ch, CURLOPT_HEADER, false);
        // Disable HTTPS certificate checks (insecure; avoid in production)
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
        // Follow redirects (at most 10) and set Referer automatically on each hop
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
        curl_setopt($ch, CURLOPT_AUTOREFERER, true);
        // "" sends Accept-Encoding for every encoding this curl build can decode
        curl_setopt($ch, CURLOPT_ENCODING, "");
        // Return the response as a string instead of printing it
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
        $content = curl_exec($ch);
        if (curl_errno($ch)) {
            echo 'Error:' . curl_error($ch);
            $content = false;
        }
        // Close the handle on both the success and the error path
        curl_close($ch);
        return $content;
    }

    // Build the request headers
    function FormatHeader($url)
    {
        // Parse the URL
        $temp = parse_url($url);
        $scheme = isset($temp['scheme']) ? $temp['scheme'] : 'http';
        // The request line ("POST /path HTTP/1.1") is not a header; curl writes it itself.
        // Accept-Encoding is left to CURLOPT_ENCODING and User-Agent to CURLOPT_USERAGENT.
        $header = array(
            "Host: {$temp['host']}",
            "Referer: {$scheme}://{$temp['host']}/",
            'Content-Type: text/xml; charset=utf-8',
            'Accept: application/json, text/javascript, */*; q=0.01',
            'Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2',
            'Connection: keep-alive',
            'X-Requested-With: XMLHttpRequest',
        );
        return $header;
    }
    ?>
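For callers that need to tell "blocked by the server" apart from "network error", it may help to return the HTTP status code and the error text instead of echoing them. Below is a rough sketch of such a variant of fetch_url(). The name fetch_with_status(), its parameters, and the shape of the returned array are all assumptions for illustration and are not from the original post. It leaves HTTPS verification at curl's defaults and does not set CURLOPT_FAILONERROR, so the body of a 4xx/5xx response is still returned.

```php
<?php
/**
 * Variant of fetch_url() that reports the HTTP status and curl error
 * to the caller. fetch_with_status() is a hypothetical name.
 */
function fetch_with_status($url, array $headers = array(), $timeout = 30)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_HTTPHEADER     => $headers,
        CURLOPT_RETURNTRANSFER => true,   // return the body as a string
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 10,
        CURLOPT_ENCODING       => '',     // accept every encoding curl can decode
        CURLOPT_CONNECTTIMEOUT => $timeout,
        CURLOPT_TIMEOUT        => $timeout,
    ));
    $body = curl_exec($ch);
    // The handle is released when $ch goes out of scope at the end of the function.
    return array(
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE), // 0 if no response arrived
        'body'   => $body,                                  // false on transport errors
        'error'  => curl_error($ch),                        // '' when curl reports none
    );
}
```

A caller could then check $result['body'] === false for transport failures and $result['status'] >= 400 for requests the server rejected, and pass the array from FormatHeader($url) as the second argument.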

3. Usage example:


  
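    <?php
    // lcg_value() returns a pseudo-random float in the range (0, 1);
    // here it is appended as a query parameter so that each request URL differs
    $url = "http://www.xxx.com/getdata.php?v=" . lcg_value();
    // Fetch the URL
    $html = fetch_url($url);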
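The call above hard-codes "?v=" into the URL, which only works when the URL has no query string yet. As a small sketch, the helper below appends the random parameter with "?" or "&" as needed. add_cache_buster() is a hypothetical name, and it uses mt_rand() in place of lcg_value(), which PHP 8.4 and later report as deprecated.

```php
<?php
/**
 * Append a random parameter to a URL, whether or not it already has
 * a query string. add_cache_buster() is a hypothetical helper.
 */
function add_cache_buster($url, $param = 'v')
{
    // Keep any #fragment at the very end of the URL.
    $fragment = '';
    $hashPos = strpos($url, '#');
    if ($hashPos !== false) {
        $fragment = substr($url, $hashPos);
        $url = substr($url, 0, $hashPos);
    }
    // Use "?" for the first parameter and "&" for any later one.
    $separator = (strpos($url, '?') === false) ? '?' : '&';
    return $url . $separator . rawurlencode($param) . '=' . mt_rand() . $fragment;
}

echo add_cache_buster('http://www.xxx.com/getdata.php'), "\n";
echo add_cache_buster('http://www.xxx.com/getdata.php?id=7'), "\n";
```

The returned string can be passed to fetch_url() in place of the hand-built $url.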

 


Reposted from: https://blog.csdn.net/qq15577969/article/details/110913311