#1
hi all ! there is a problem confused me, i create a socket ,and connect the web host at 80, get the html, when read data form fd(which is created above), it read some data seems not belong to the web site's html. ex: i read the www.google.cn ,it return following: HTTP/1.1 200 OK Cache-Control: private Content-Type: text/html; charset=GB2312 Set-Cookie: PREF=ID=2089898e46137a4a:NW=1:TM=1181662196:LM=118 1662196:S=T5zGUwA1MR8QBCoI; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com Server: GWS/2.1 Transfer-Encoding: chunked X-Google-Backends: prcsat-gfe.l.google.com:80,mctf10:80 X-Google-Service: www X-Google-Request-Trace: mctf10:80,prcsat-gfe.l.google.com:80,mctf10:80 Date: Tue, 12 Jun 2007 15:29:26 GMT bcf -----> here is the data confused me <html><head><meta http-equiv="content-type" content="text/html; charset=GB2312"><title>Google</title><style><!-- body,td,a,p,.h{font-family:""} ..h{font-size:20px} ..h{color:#3366cc} ..q{color:#00c} --></style> <script> <!-- function sf(){document.f.q.focus();} // --> </script> </head><body bgcolor=#ffffff text=#000000 link=#0000cc vlink=#551a8b alink=#ff0000 onload="sf();if(document.images){new Image().src='/ images/nav_logo3.png'}" topmargin=3 marginheight=3><div align=right id=guser style="font-size:84%;padding-bottom:4px" width=100%><nobr><a href="https://www.google.com/accounts/Login?continue=http:// 203.208.33.101/& hl=zh-CN">µÇ¼</a></nobr></div><center><br id=lgpd><table cellpadding=0 cellspacing=0 border=0><tr><td align=right valign=bottom><img src=images/hp0.gif width=158 height=78 alt="Google"></td><td valign=bottom><img src=images/hp1.gif width=50 height=78 alt=""></td><td valign=bottom><img src=images/hp2.gif width=68 height=78 alt=""></td></tr><tr><td class=h align=right valign=top><b></b></td><td valign=top><img src=images/hp3.gif width=50 height=32 alt=""></td><td valign=top class=h><font color=#666666 style=font-size:16px><b>ÖÐÎÄ(¼òÌå)</b></font>< /td></tr></table><br><form action="/search" 
name=f><style>#lgpd{display:none}</style><script defer><!-- //--> </script><table border=0 cellspacing=0 cellpadding=4><tr><td nowrap><font size=-1><b>ÍøÒ³</b> <a class=q href="http:// images.google.com/imghp?ie=GB2312&oe=GB2312&hl=zh-CN& tab=wi">ͼƬ</ a> <a class=q href="http://news.google.com/nwshp? ie=GB2312&oe=GB2312&hl=zh-CN& tab=wn">×Ê Ñ¶</a> <a class=q href="http://groups.google.com/grphp?ie=GB2312&oe=GB2312&hl=zh-CN& tab=wg">ÂÛ̳</a> <b><a href="/intl/zh-CN/options/" class=q>¸ü¶à </a></ b></font></td></tr> </table><table cellpadding=0 cellspacing=0><tr valign=top><td width=25%> </td><td align=center nowrap><input name=hl type=hidden value=zh-CN><input type=hidden name=ie value="GB2312"><input maxlength=2048 name=q size=55 title="GoogleËÑË÷" value=""><br><input name=btnG type=submit value="Google ËÑË÷"><input name=btnI type=submit value="ÊÖÆø²»´í"></td><td nowrap width=25%><font size=-1> <a href=/advanced_search?hl=zh-CN>¸ß¼¶ËÑË÷</a><br> <a href=/ preferences?hl=zh-CN>ʹÓÃÆ«ºÃ</a><br> <a href=/language_tools?hl=zh-CN>ÓïÑÔ¹¤ ¾ß</a></font></td> </tr><tr><td align=center colspan=3><font size=-1><input id=all type=radio name=lr value="" checked><label for=all>ËùÓÐÍøÒ³ </label><input id=ch type=radio name=lr value="lang_zh- CN|lang_zh-TW"><label for=ch>ÖÐÎÄÍøÒ³ </label><input id=il type=radio name=lr value="lang_zh-CN"><label for=il>¼òÌå ÖÐÎÄÍøÒ³ </label></font></ td></tr></table></form><br><br><font size=-1><a href="/intl/zh-CN/ ads/">¹ã¸æ¼Æ»®</a> - <a href="/intl/zh-CN/about.html">Google ´óÈ«</a> - <a href=http://www.google.com/ncr>Google.com in English</a></ font><p><font size=-1> 2007 Google</font></p></center></body></ 5 -----> and here is the data confused me html> 0 ----->and here is the data confused me why this happen? can someone tell me the thing i should care? thanks ! step |
#2
On Jun 12, 8:55 am, step <fxl...@gmail.com> wrote:
> HTTP/1.1 200 OK

I'm betting you sent 'HTTP/1.1' in your query.

> 0 ----->and here is the data confused me
>
> why this happen? can someone tell me the thing i should care? thanks !

Did you read the HTTP 1.1 specification? DO NOT EVER CLAIM TO SUPPORT A PROTOCOL YOU DO NOT ACTUALLY SUPPORT.

DS
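
To illustrate the point about the request line, here is a minimal sketch (not the original poster's code, which was not shown) of a client that only claims HTTP/1.0. RFC 2616 section 3.6 forbids servers from sending transfer-codings such as "chunked" to an HTTP/1.0 client, so the body should arrive without size lines. The helper names `build_request` and `fetch` are made up for this example, and reading until the server closes the connection is an assumption that holds for a plain HTTP/1.0 exchange without keep-alive.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build a GET request that claims only HTTP/1.0 (hypothetical helper)."""
    return (
        f"GET {path} HTTP/1.0\r\n"   # HTTP/1.0: no claim of chunked support
        f"Host: {host}\r\n"          # not required by 1.0, but widely expected
        "\r\n"                       # empty line ends the request headers
    ).encode("ascii")


def fetch(host: str, path: str = "/", port: int = 80) -> bytes:
    """Send the request and read the raw response until the server closes."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(host, path))
        parts = []
        while True:
            data = sock.recv(4096)
            if not data:             # empty read: the peer closed the connection
                break
            parts.append(data)
    return b"".join(parts)
```

The alternative is to keep sending HTTP/1.1 and actually handle `Transfer-Encoding: chunked` in the reader.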
#3
On Jun 12, 11:55 am, step <fxl...@gmail.com> wrote:
> hi all !
>
> there is a problem confused me, i create a socket ,and connect
> the web host at 80, get the html, when read data form fd(which is
> created above), it read some data seems not belong to the web site's
> html. ex:
> i read the www.google.cn, it return following:
> HTTP/1.1 200 OK
> Cache-Control: private
> Content-Type: text/html; charset=GB2312
> Set-Cookie:
> PREF=ID=2089898e46137a4a:NW=1:TM=1181662196:LM=118 1662196:S=T5zGUwA1MR8QBCoI;
> expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
> Server: GWS/2.1
> Transfer-Encoding: chunked
> X-Google-Backends: prcsat-gfe.l.google.com:80,mctf10:80
> X-Google-Service: www
> X-Google-Request-Trace: mctf10:80,prcsat-gfe.l.google.com:80,mctf10:80
> Date: Tue, 12 Jun 2007 15:29:26 GMT
>
> bcf -----> here is the data confused me
> <html><head><meta http-equiv="content-type" content="text/html;

[snip]

> font><p><font size=-1> 2007 Google</font></p></center></body></
> 5 -----> and here is the data confused me
> html>
> 0 ----->and here is the data confused me
>
> why this happen? can someone tell me the thing i should care? thanks !

To augment David's response, what you see is the HTTP "chunked" response control information. You want to take a look at section 3.6.1 of RFC 2616, where it tells you about the format of a "chunked" response.

From the RFC:

  3.6.1 Chunked Transfer Coding

  The chunked encoding modifies the body of a message in order to transfer it as a series of chunks, each with its own size indicator, followed by an OPTIONAL trailer containing entity-header fields. This allows dynamically produced content to be transferred along with the information necessary for the recipient to verify that it has received the full message.

  [snip modified BNF for chunk encoding]

  The chunk-size field is a string of hex digits indicating the size of the chunk. The chunked encoding is ended by any chunk whose size is zero, followed by the trailer, which is terminated by an empty line.

  The trailer allows the sender to include additional HTTP header fields at the end of the message. The Trailer header field can be used to indicate which header fields are included in a trailer (see section 14.40).
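
The format described in the RFC excerpt can be sketched as a small decoder. This is an illustration only, assuming the whole message body (everything after the blank line that ends the headers) is already in memory; the function name `decode_chunked` is invented for this example, and the optional trailer fields are skipped rather than parsed.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked message body (RFC 2616, section 3.6.1)."""
    decoded = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a size line: hex digits, an optional
        # ";extension" part, then CRLF.
        line_end = body.index(b"\r\n", pos)
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            # A zero-size chunk ends the body; any trailer after it is ignored.
            break
        chunk = body[pos:pos + size]
        if len(chunk) < size:
            raise ValueError("chunk data is shorter than its declared size")
        decoded += chunk
        # Skip past the chunk data and the CRLF that terminates it.
        pos += size + 2
    return bytes(decoded)


# Two small chunks followed by the terminating zero-size chunk.
sample = b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
print(decode_chunked(sample))
```

Applied to the response in the first post, the size line `bcf` is hexadecimal for 3023 bytes of HTML, `5` covers the final `html>`, and `0` marks the end of the body.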
| Tags |
| html, linux, problem, read, socket |