Scott Hanselman

Target: Referral Spam in dasBlog

December 4, '04 Comments [1] Posted in ASP.NET | DasBlog | XML
Sponsored By

I've pretty much solved the comment-spam problem (only one person has voiced their distaste so far), but a recent perusal of my logs and older posts turned up a ridiculous amount of referral spam.

This is when someone hits a post on your site with a changed/hacked HTTP Referer header that lies about where they came from. If your blog adds this referrer to the page, as most do, you've just linked to Hot Gay Sex (not that there's anything wrong with Hot Sex between consenting adults :) ) or whatever, thanks to their actions.

The story goes that when Google comes around, it sees that you've linked to the spammers, and they get Google Juice via the PageRank system.

Not only is this potentially offensive to my readers, it also obscures the posts and comments when they are filled with referrals.

Potential Solutions:

  • Stop printing out referrals on my pages.
    • Personally, I like to see them, and I think they provide value to the reader so they can see other places with information of interest. It also promotes cross-linking between my peer blogs.
  • Modify dasBlog to NOT add icky referrals.
    • This would be ideal. However, it will likely be in version 1.7 in some way, either via James Snape's whitelist solution (I think a whitelist removes the point of referrals, and I'd greatly prefer a keyword-based blacklist) or some other technique.
    • I've avoided running a "private build" of dasBlog so far (as evidenced by my care in creating the CAPTCHA solution without recompiling) and I'd like to continue as such.
  • Clean the .xml files occasionally with a process
    • This is quick, easy, can be automated, and will work in the short term for me as I await dasBlog 1.7.

So, here was an opportunity to use the only dev stuff I have on my home machine: Visual Studio C# 2005 Express.

Here's what I did. Use at your own risk, back up your /content directory, and know that this only needs to run on your "*.dayextra.xml" files from dasBlog. No error handling, no warranty, but it worked for me. Enjoy.

Usage: TrackingFilter "c:\yourdasblogcontentdirectory"

File Attachment: TrackingFilter.zip (9 KB) (for VS.NET 2005, I don't know if it works in 2003)

WARNING: The words I put in the .config file are ; delimited and are unquestionably offensive. Not only do they include most of George Carlin's words but they also include "bloglines" and "artima" because they don't provide a value in my referral list.
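
For the curious, the keyword check itself can be sketched in a few lines. This is not the code from TrackingFilter.zip, just a minimal illustration of a ;-delimited, keyword-based blacklist. The class and method names (ReferralFilter, ParseBlacklist, IsSpam) are made up for this example, and the sketch says nothing about how the referrals are actually stored in the dayextra.xml files.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of dasBlog or TrackingFilter.
public static class ReferralFilter
{
    // Splits a ;-delimited keyword list into trimmed, non-empty keywords.
    public static List<string> ParseBlacklist(string delimited)
    {
        var keywords = new List<string>();
        foreach (string raw in delimited.Split(';'))
        {
            string keyword = raw.Trim();
            if (keyword.Length > 0) keywords.Add(keyword);
        }
        return keywords;
    }

    // True if the referrer URL contains any blacklisted keyword, ignoring case.
    public static bool IsSpam(string refererUrl, IEnumerable<string> blacklist)
    {
        if (string.IsNullOrEmpty(refererUrl)) return false;
        foreach (string keyword in blacklist)
        {
            if (refererUrl.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
                return true;
        }
        return false;
    }
}
```

With a blacklist of "bloglines;artima", a call like ReferralFilter.IsSpam(url, ReferralFilter.ParseBlacklist("bloglines;artima")) would flag any referrer whose URL contains either word. Plain substring matching is crude and will produce false positives, which is the trade-off of a blacklist.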

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by SherWeb

GUI Front End to Chris Sells' XmlPreCompiler - For Debugging XmlSerialization Errors

December 3, '04 Comments [0] Posted in ASP.NET | XmlSerializer | Bugs | Tools

I spend a lot of time with the XmlSerializer (I personally dig it immensely, and I think too many people complain about it, but anyway) and while I put up an article on how to debug directly into the generated assemblies, I noticed that Mathew Nolton has a GUI Front-End to Chris's XmlSerializerPreCompiler.

The tool will check to see if a type can be serialized by the XmlSerializer and shows any compiler errors that happen behind the scenes. +1 for Useful, thanks Chris and Mathew.

HTTP POSTs and HTTP GETs with WebClient and C# and Faking a PostBack

December 3, '04 Comments [5] Posted in ASP.NET | ViewState | Tools

A fellow emailed me wanting to screen scrape, er, ah, harvest a page that only displays the data he wants with a postback.

Remember what an HTTP GET looks like under the covers:

GET /whatever/page.aspx?param1=value&param2=value

Note that the GET includes no HTTP Body. That's important. With a POST the 'DATA' moves from the QueryString into the HTTP Body, but you can still have stuff in the QueryString.

POST /whatever/page.aspx?optionalotherparam1=value
Content-Type: application/x-www-form-urlencoded
Content-Length: 25

param1=value&param2=value

Note the Content-Type header and the Content-Length, those are important.
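
To see where the 25 in that Content-Length comes from, here is a small sketch that builds a form-encoded body and counts its bytes. The FormBody class and its methods are hypothetical names for this example. It uses Uri.EscapeDataString for encoding, which writes a space as %20 rather than the + that browsers usually send in form posts.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only.
public static class FormBody
{
    // Builds "name1=value1&name2=value2" from alternating name/value
    // arguments, URL-encoding each name and each value.
    public static string Build(params string[] namesAndValues)
    {
        if (namesAndValues.Length % 2 != 0)
            throw new ArgumentException("Expected name/value pairs.");

        var sb = new StringBuilder();
        for (int i = 0; i < namesAndValues.Length; i += 2)
        {
            if (sb.Length > 0) sb.Append('&');
            sb.Append(Uri.EscapeDataString(namesAndValues[i]));
            sb.Append('=');
            sb.Append(Uri.EscapeDataString(namesAndValues[i + 1]));
        }
        return sb.ToString();
    }

    // Content-Length is the number of bytes in the body. The encoded
    // body is pure ASCII, so one character is one byte.
    public static int ContentLength(string body)
    {
        return Encoding.ASCII.GetByteCount(body);
    }
}
```

FormBody.Build("param1", "value", "param2", "value") gives param1=value&param2=value, and FormBody.ContentLength of that string is 25, matching the header above.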

POST is just the verb you use when the request carries an HTTP body. A GET implies you've got nothing to send.

So, in C#, here's a GET:

public static string HttpGet(string URI)
{
   System.Net.WebRequest req = System.Net.WebRequest.Create(URI);
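   //ProxyString is assumed to be a string field holding your proxy's address
   req.Proxy = new System.Net.WebProxy(ProxyString, true); //true means bypass the proxy for local addresses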
   System.Net.WebResponse resp = req.GetResponse();
   System.IO.StreamReader sr = new System.IO.StreamReader(resp.GetResponseStream());
   return sr.ReadToEnd().Trim();
}

Here's a POST:

public static string HttpPost(string URI, string Parameters)
{
   System.Net.WebRequest req = System.Net.WebRequest.Create(URI);
   req.Proxy = new System.Net.WebProxy(ProxyString, true);
   //Add these, as we're doing a POST
   req.ContentType = "application/x-www-form-urlencoded";
   req.Method = "POST";
   //We need to count how many bytes we're sending. Faked form posts should look like name1=value1&name2=value2
   byte [] bytes = System.Text.Encoding.ASCII.GetBytes(Parameters);
   req.ContentLength = bytes.Length;
   System.IO.Stream os = req.GetRequestStream ();
   os.Write (bytes, 0, bytes.Length); //Push it out there
   os.Close ();
   System.Net.WebResponse resp = req.GetResponse();
   if (resp== null) return null;
   System.IO.StreamReader sr = new System.IO.StreamReader(resp.GetResponseStream());
   return sr.ReadToEnd().Trim();
}

I could and should have put in more 'using' statements, but you get the gist. And, there are other ways to have done this with the BCL, but this is one.

Now, how would you fake an HTTP PostBack? Use a tool like ieHttpHeaders to watch what a real postback looks like, and well, fake it. :) Just hope they don't require unique/encrypted ViewState (via ViewStateUserKey or EnableViewStateMac) for that page, or you're out of luck.
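
As a rough sketch of the "fake it" part: GET the page first, pull the hidden __VIEWSTATE field out of the HTML, then send it back in the POST body along with the control that supposedly caused the postback. The PostBackFaker class and its method names are invented for this example, and the regex assumes the name attribute comes before value inside the input tag. Real pages usually need their other form fields posted too, and a button click is sent as the button's own name=value pair rather than via __EVENTTARGET, so check the captured request before trusting this.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class PostBackFaker
{
    // Returns the value of the hidden __VIEWSTATE input, or null if the
    // page doesn't contain one.
    public static string ExtractViewState(string html)
    {
        Match m = Regex.Match(
            html,
            "<input[^>]*name=\"__VIEWSTATE\"[^>]*value=\"([^\"]*)\"",
            RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }

    // Builds a form-encoded body that names eventTarget as the control
    // that caused the postback. ViewState is base64, so characters such
    // as + and = must be URL-encoded.
    public static string BuildPostBackBody(string viewState, string eventTarget)
    {
        return "__EVENTTARGET=" + Uri.EscapeDataString(eventTarget)
             + "&__EVENTARGUMENT="
             + "&__VIEWSTATE=" + Uri.EscapeDataString(viewState);
    }
}
```

The resulting string is what you would pass as the Parameters argument to HttpPost above, after fetching the page's HTML with HttpGet.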

Being a good .NET citizen means certain things...start with your debugging skills

December 3, '04 Comments [5] Posted in ASP.NET | Web Services | Bugs

I've not been one to work the newsgroups, answering questions. I probably should. I'm more of a one on one person, and I tend to go the extra mile when folks (largely strangers) ask me technical questions. I've had email threads 10-deep with total strangers on technical questions, and only at the end do I say, "Um, do I know you?"

I haven't done what Scott Mitchell wisely did and set up a "Getting Help" policy, but I'm quickly getting there. I'll happily answer your question for $75 too, satisfaction guaranteed, and I'll blog the answer. I've done this hundreds of times for free. :)

Anyway, the point of this post was this: People, for crying out loud, debug a few things before you ask for help. If you don't know how to debug, learn or ask someone to teach you.

So, I present:

The Hanselman List of .NET Debugging Dos and Don'ts

Don't - Say "Hey, I got a NullReferenceException, what's the problem?"
Do - Provide a Stack Trace/Dump with the line number it likely happened on.

Don't - Get deep into your complicated program, find a bug and insist it's BillG's fault.
Do - Reproduce the bug in some simple test program and tell the world. Remember, 9 times out of 10 it's you.

Don't - Decide there's a problem if you don't know the preferred behavior.
Do - Always Assert your assumptions. If null can happen, check for it. BUT, if null must never happen, it's time for a Debug.Assert.

Don't - Move code around blindly, somehow fix your bug, ignore it and keep coding. Programming by Coincidence!
Do - Understand your program fully. Remember what Andy and Dave say about lucky folks who step into minefields and don't die. Just because you didn't die, doesn't mean there aren't mines!

Don't - Reformat or "pave" something because you don't know what's wrong. If you get a spot on your carpet, fix the spot. Don't lay new carpet.
Do - Know enough about your environment to know what your program's dependencies are. If your registry settings can get boogered, Debug.Assert that you are getting good values from the registry.

Don't - Get overly frustrated with Assembly loading/versioning/policy. At least the Assembly Loader follows clear, set, rules.
Do - Make a folder called C:\FusionLogs, then go to the registry at HKLM\Software\Microsoft\Fusion and make a DWORD value LogFailures=1 and a string value LogPath=C:\FusionLogs. Every AppDomain that has a binding failure or weird redirect will get logged. Know: what assembly you wanted, what the loader looked for, and what you got. Know where Assemblies are searched for.

Don't - Avoid debugging. Debugging in .NET is easier than ever before. Remote debugging and AttachToProcess are gifts. Don't stop at a point in the call stack if you can keep going by finding PDBs.
Do - Keep your Source and PDBs in the same location. We keep ZIPs of every build's PDBs. Just today we dug up 9-month old PDBs and source (from CVS) to debug into some confusion. Not saving those PDBs would have screwed us. Create a Symbol Server.

Don't - Limit yourself to the QuickWatch. Learn what VS.NET has to offer.
Do - Use the Immediate Window to test theories. Remember that you can perform Casts in the Watch Window. Remember that you can drag and drop variables into the Watch Window. Remember you have 4 Watch Windows, not to mention Autos and Locals. Learn how to use Conditional Breakpoints!

Don't - Skip debugging something just because you can't figure out how to launch the process from the VS.NET Project Properties.
Do - Use Debug|Processes|Attach to attach to processes that have your DLL loaded. Use ProcExp from SysInternals as a better Task Manager to see .NET processes, as well as a system-wide DLL search. Who's got you loaded?
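
To make the "Assert your assumptions" item above concrete, here's a minimal sketch of the difference between a null that can happen and a null that must never happen. The class, methods and messages are all made up for the example. Keep in mind that Debug.Assert calls are only compiled into Debug builds.

```csharp
using System;
using System.Diagnostics;

// Hypothetical example class for illustration only.
public static class AssertExample
{
    // The name comes from outside (a user, a file, the registry), so
    // null CAN happen. Check for it and handle it.
    public static string Greeting(string customerName)
    {
        if (customerName == null) return "Hello, guest";
        return "Hello, " + customerName;
    }

    // The array is built by our own code, so null must NEVER happen.
    // If it does, a caller has a bug and we want to hear about it loudly
    // while debugging.
    public static int Total(int[] prices)
    {
        Debug.Assert(prices != null, "prices must never be null here");
        int total = 0;
        foreach (int price in prices) total += price;
        return total;
    }
}
```

The check in Greeting is part of the program's normal behavior. The assert in Total documents an assumption and fires during development if that assumption is ever wrong.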

FireBlogging - The View From My House

December 2, '04 Comments [0] Posted in Musings

It's 1:09am on Thursday, December 2nd 2004, and here's the view from my bedroom window. The house next door is burning and we share a wooden fence. Pretty exciting stuff! Fortunately, I'm not too worried, I'm in a family of fire-fighters.

[Photos: CIMG2539, CIMG2546, CIMG2547, CIMG2540]

P.S. For those of you not in the U.S., most, if not all, residential housing (especially in the Suburban Western U.S.) is made of wood and quite flammable. My wife's still not used to this fact, and her family isn't impressed that our family fights fires. :)

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.