« Being a good .NET citizen means certain ... | Main | GUI Front End to Chris Sells' XmlPreComp... »

A fellow emailed me wanting to screen scrape, er, ah, harvest a page that only displays the data he wants with a postback.

Remember what an HTTP GET looks like under the covers:

GET /whatever/page.aspx?param1=value&param2=value

Note that the GET includes no HTTP Body. That's important. With a POST the 'DATA' moves from the QueryString into the HTTP Body, but you can still have stuff in the QueryString.

POST /whatever/page.aspx?optionalotherparam1=value
Content-Type: application/x-www-form-urlencoded
Content-Length: 25
param1=value&param2=value

Note the Content-Type header and the Content-Length, those are important.

A POST is just the verb for when you have an HTTP document. A GET implies you got nothing.

So, in C#, here's a GET:

public static string HttpGet(string URI)
{
   System.Net.WebRequest req = System.Net.WebRequest.Create(URI);
   req.Proxy = new System.Net.WebProxy(ProxyString, true); //true means no proxy
   System.Net.WebResponse resp = req.GetResponse();
   System.IO.StreamReader sr = new System.IO.StreamReader(resp.GetResponseStream());
   return sr.ReadToEnd().Trim();
}

Here's a POST:

public static string HttpPost(string URI, string Parameters)
{
   System.Net.WebRequest req = System.Net.WebRequest.Create(URI);
   req.Proxy = new System.Net.WebProxy(ProxyString, true);
   //Add these, as we're doing a POST
   req.ContentType = "application/x-www-form-urlencoded";
   req.Method = "POST";
   //We need to count how many bytes we're sending. Post'ed Faked Forms should be name=value&
   byte [] bytes = System.Text.Encoding.ASCII.GetBytes(Parameters);
   req.ContentLength = bytes.Length;
   System.IO.Stream os = req.GetRequestStream ();
   os.Write (bytes, 0, bytes.Length); //Push it out there
   os.Close ();
   System.Net.WebResponse resp = req.GetResponse();
   if (resp== null) return null;
   System.IO.StreamReader sr = new System.IO.StreamReader(resp.GetResponseStream());
   return sr.ReadToEnd().Trim();
}

I could and should have put in more 'using' statements, but you get the gist. And, there are other ways to have done this with the BCL, but this is one.

Now, how would you fake an HTTP PostBack? Use a tool like ieHttpHeaders to watch what a real postback looks like, and well, fake it. :) Just hope they don't require unique/encrypted ViewState (via ViewStateUserKey or EnableViewStateMac) for that page, or you're out of luck.



Friday, December 03, 2004 1:04:33 AM (Pacific Standard Time, UTC-08:00)
Whilst ieHttpHeaders is rather handy I've found the small toolbar just doesn't make it easy to read, especially with large requests (*cough* viewstate *cough*)

There's always http://www.fiddlertool.com/ which personally I find more flexible.
Friday, December 03, 2004 6:39:39 AM (Pacific Standard Time, UTC-08:00)
ieHttpHeaders - Broken Link
Friday, December 03, 2004 10:03:02 AM (Pacific Standard Time, UTC-08:00)
I fixed the link.
Friday, December 03, 2004 10:40:38 AM (Pacific Standard Time, UTC-08:00)
To postback the viewstate, it is useful to pick a landing page to hit before doing the postback, so that you can get a legit viewstate and not worry about whether the site has specific expectations of their viewstate. Basically, just pick the page that you would normally HTTP GET before you postback, and then make sure you copy the viewstate from the page to the post. You have to be careful to get the encoding right on the postback, since the viewstate is base64 and some of that isn't application/x-www-form-urlencoded compatible.
Monday, February 27, 2006 12:42:04 PM (Pacific Standard Time, UTC-08:00)
Hi,
First thank you very much for this piece of code :)
I would like to inform you System.Net.WebResponse has a Close() method, I think it should be better if you call it no?
Franck Quintana
Comments are closed.

Contact

Sponsors

Hosting By

Hot Topics

Tags

Calendar

<March 2010>
SunMonTueWedThuFriSat
28123456
78910111213
14151617181920
21222324252627
28293031123
45678910

Archives

March, 2010 (10)
February, 2010 (17)
January, 2010 (13)
December, 2009 (13)
November, 2009 (7)
October, 2009 (19)
September, 2009 (11)
August, 2009 (12)
July, 2009 (21)
June, 2009 (26)
May, 2009 (16)
April, 2009 (13)
March, 2009 (17)
February, 2009 (17)
January, 2009 (18)
December, 2008 (32)
November, 2008 (17)
October, 2008 (22)
September, 2008 (16)
August, 2008 (14)
July, 2008 (25)
June, 2008 (19)
May, 2008 (17)
April, 2008 (17)
March, 2008 (26)
February, 2008 (21)
January, 2008 (28)
December, 2007 (19)
November, 2007 (17)
October, 2007 (31)
September, 2007 (39)
August, 2007 (37)
July, 2007 (43)
June, 2007 (37)
May, 2007 (32)
April, 2007 (38)
March, 2007 (29)
February, 2007 (46)
January, 2007 (31)
December, 2006 (27)
November, 2006 (31)
October, 2006 (32)
September, 2006 (39)
August, 2006 (34)
July, 2006 (40)
June, 2006 (18)
May, 2006 (31)
April, 2006 (34)
March, 2006 (30)
February, 2006 (38)
January, 2006 (44)
December, 2005 (19)
November, 2005 (34)
October, 2005 (24)
September, 2005 (37)
August, 2005 (20)
July, 2005 (24)
June, 2005 (33)
May, 2005 (16)
April, 2005 (22)
March, 2005 (34)
February, 2005 (15)
January, 2005 (37)
December, 2004 (28)
November, 2004 (30)
October, 2004 (34)
September, 2004 (22)
August, 2004 (34)
July, 2004 (18)
June, 2004 (64)
May, 2004 (49)
April, 2004 (21)
March, 2004 (29)
February, 2004 (29)
January, 2004 (36)
December, 2003 (25)
November, 2003 (24)
October, 2003 (59)
September, 2003 (42)
August, 2003 (24)
July, 2003 (44)
June, 2003 (29)
May, 2003 (21)
April, 2003 (30)
March, 2003 (27)
February, 2003 (47)
January, 2003 (50)
December, 2002 (31)
November, 2002 (38)
October, 2002 (44)
September, 2002 (15)
May, 2002 (2)
April, 2002 (4)

Google Ads