Scott Hanselman

The Weekly Source Code 53 - "Get'er Done" Edition - XML in the left hand becomes HTTP POSTs in the right hand

June 26, 2010. Posted in Open Source | Source Code

I wrote some code tonight in about ten minutes with a "Get'er Done" attitude. We all do that (I hope). It's one-time code that will solve a one-time problem. As such, it isn't always pretty, but it often is interesting.

I've got this silly website called OverheardAtHome that is a collection of silly quotes and stories that my kids (and your kids) have said around the house. It was running on DasBlog (just like this blog) for the last year or more, but the sheer workflow of populating the site was getting tiresome. DasBlog isn't set up for screening external submissions and promoting them to posts, and I wasn't really interested in extending DasBlog in that way.

Instead, I needed to move OverheardAtHome to a hosted blogging solution, preferably a nice free one as it doesn't make any money. It's just a hobby. I like Tumblr so I figured I'd put it there. Tumblr has a very basic HTTP API and their native UI supports User Submissions, so it seemed like a win.

DasBlog stores its content not in a database, but rather in an XML file per day. I've got a few hundred XML files that make up the whole of the content on OverheardAtHome and it's very basic stuff.

Here's what I wrote, using the Tumblr API from CodePlex written by (I believe) Jeremy "madkidd" Hodges, who is also a Developer on Graffiti CMS, coincidentally. It's a nice little abstraction on top of HttpWebRequest that uses a little HttpHelper class from "rakker."

I figured I could use PowerShell or something script-like, but this was very fast to write. There's no error handling, but interestingly (or not), there were no errors in hundreds of posts.

using System;
using System.Linq;
using System.IO;
using System.Xml.Linq;
using System.Threading;

namespace TumblrAPI.ConsoleApp
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Started...");

            TumblrAPI.Authentication.Email = "myemail@notyours.com"; // Console.ReadLine();
            TumblrAPI.Authentication.Password = "poopypants"; //Console.ReadLine();
            Console.WriteLine(TumblrAPI.Authentication.Authenticate().ToString());

            if (TumblrAPI.Authentication.Status == TumblrAPI.AuthenticationStatus.Valid) {
                Console.WriteLine("Now make some posts...");

                DirectoryInfo di = new DirectoryInfo(@"C:\overheardathome\xml");
                //Get the DasBlog XML files, they are like <entry><title/><content/></entry> and stuff
                FileSystemInfo[] files = di.GetFileSystemInfos("*.dayentry.xml");
                var orderedFiles = files.OrderBy(f => f.Name);

                XNamespace ns = "urn:newtelligence-com:dasblog:runtime:data";
                foreach (FileSystemInfo file in orderedFiles)
                {
                    XDocument xml = XDocument.Load(file.FullName);
                    var posts = from p in xml.Descendants(ns + "Entry") select p;

                    foreach (var post in posts)
                    {
                        TumblrAPI.Post.Text t = new TumblrAPI.Post.Text();
                        t.Title = (string)post.Element(ns + "Title");
                        t.Body = (string)post.Element(ns + "Content");
                        Thread.Sleep(500); //Tumblr will API limit me if I bash on them.
                        Console.WriteLine(" Response from text post: {0}", t.Publish());
                    }
                }
            }

            Console.WriteLine("Done, press any key...");
            Console.ReadLine();
        }
    }
}
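
To try the XML-reading half on its own, without touching Tumblr, something like the following sketch might help. It is not part of the original program: the DayEntrySketch class, the ReadEntries helper and the inline sample document are all made up for illustration, and the sample's element layout is only an assumption based on the comment in the code above (a real DasBlog file carries more fields). The namespace string is the one from the post. It assumes C# 7 or later for the tuple syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DayEntrySketch
{
    // Namespace copied from the post's code.
    public static readonly XNamespace Ns = "urn:newtelligence-com:dasblog:runtime:data";

    // Hypothetical sample document; the layout is an assumption, not a real DasBlog file.
    public const string SampleXml = @"<DayEntry xmlns=""urn:newtelligence-com:dasblog:runtime:data"">
  <Entries>
    <Entry><Title>First</Title><Content>Hello</Content></Entry>
    <Entry><Title>Second</Title><Content>World</Content></Entry>
  </Entries>
</DayEntry>";

    // Left hand only: turn one day file into (Title, Body) pairs, with no posting involved.
    public static List<(string Title, string Body)> ReadEntries(XDocument xml)
    {
        return xml.Descendants(Ns + "Entry")
                  .Select(e => (Title: (string)e.Element(Ns + "Title"),
                                Body: (string)e.Element(Ns + "Content")))
                  .ToList();
    }

    public static void Main()
    {
        XDocument xml = XDocument.Parse(SampleXml);

        foreach (var entry in ReadEntries(xml))
        {
            Console.WriteLine("{0}: {1}", entry.Title, entry.Body);
        }

        // An unqualified name does not match elements that live in a default namespace,
        // which is why the lookups are written as ns + "Entry".
        Console.WriteLine("Without namespace: {0}", xml.Descendants("Entry").Count());
    }
}
```

Keeping the read step separate like this makes it possible to eyeball the titles and bodies before anything is posted.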

Comments? What's a better pattern for left-hand/right-hand bulk crap like this, Dear Reader?

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

June 26, 2010 8:07
Cool. Making it a bit dirty with my fluent dynamic wrapper for Xml - Something like
 foreach (var file in orderedFiles)
                {
                    dynamic doc = XDocument.Load(file.FullName).Root.ToElastic();
                    foreach (var entry in doc.entry_All)
                    {
                        TumblrAPI.Post.Text t = new TumblrAPI.Post.Text();
                        t.Title = ~entry.Title;
                        t.Body = ~entry.Content;
                        Thread.Sleep(500); //Tumblr will API limit me if I bash on them.
                        Console.WriteLine("  Response from text post: {0}", t.Publish());
                    }
                }
ToElastic() is here - http://amazedsaint.blogspot.com/2010/02/introducing-elasticobject-for-net-40.html
June 26, 2010 8:08
Why GetFileSystemInfos instead of GetFiles?
June 26, 2010 9:29
Nice - as you say, once in a while it is not about architecture, extensibility and future-proofing, but just getting a single task done right here, right now. However there always remains the danger of "I know, I once had this tool that did XYZ - maybe I could extend it to do ABC" - and before you know it, a new "product" has been born, based on something never meant to even live that long. Granted, not a problem with this little snippet here, but in my corporate environment more than once "one-off tools" were pushed all the way to a critical part of daily operations :-(
June 26, 2010 16:08
@amazedsaint: that XML-to-dynamic tool looks great! You should pop the source up on GitHub (I was only able to find it in a zip file on your website). Another way to simplify the code is to move the logic for getting the xml nodes (lines 28-31) and the logic for turning an xml node into a Tumblr object (lines 33-37) into Linq statements. It may not drastically reduce LOC, but it does get the logic for some of the looping out of your hands and into the LINQ library (where there are some optimizations to be had) and - in my opinion - makes it easier to read what is going on. Using method syntax:
var posts =
    new DirectoryInfo(@"C:\overheardathome\xml")
    .GetFileSystemInfos("*.dayentry.xml")
    .OrderBy(file => file.Name)
    .SelectMany(file => XDocument.Load(file.FullName).Root.Descendants(ns + "Entry"))
    .Select(entry => new TumblrAPI.Post.Text{
        Title = (string)entry.Element(ns + "Title"),
        Body = (string)entry.Element(ns + "Content")
    });
or, using query syntax:
var posts =
    from file in new DirectoryInfo(@"C:\overheardathome\xml").GetFileSystemInfos("*.dayentry.xml")
    orderby file.Name
    from entry in XDocument.Load(file.FullName).Root.Descendants(ns + "Entry")
    select new TumblrAPI.Post.Text{
        Title = (string)entry.Element(ns + "Title"),
        Body = (string)entry.Element(ns + "Content")
    };
I think this makes the method much more readable overall: http://gist.github.com/454149
June 26, 2010 17:57
"poopypants" - Very good, Scott. Gave me a laugh. I'm inclined to agree with Troy about LINQifying the code. Those loops are truly horrible to read. And thanks Troy for posting that code - gave me a few ideas.
June 26, 2010 21:26
Automapper? http://automapper.codeplex.com/ I would not use something special if it's done in one place. If it starts spreading all over the place, maybe a factory function or Automapper. And for this line:
    if (TumblrAPI.Authentication.Status == TumblrAPI.AuthenticationStatus.Valid)
I'd use a negative condition instead, or else put the code in a separate method. And thanks for telling me about GetFileSystemInfos() :D
June 27, 2010 15:35
Respectfully disagree on readability of loops & LINQ. But, whatever floats your boat. I tried the exercise in PowerShell, which would've been my tool of choice.
    Add-Type -Path "Path to TumblrAPI"
    write-host "Started..."
    [TumblrAPI.Authentication]::Email = "myemail@notyours.com"
    [TumblrAPI.Authentication]::Password = "poopypants"
    [TumblrAPI.Authentication]::Authenticate()
    if ([TumblrAPI.Authentication]::Status -ne [TumblrAPI.AuthenticationStatus]::Valid) {
        write-host "Unable to authenticate with Tumblr."
        return
    }
    write-host "Now make some posts..."
    #get-childitem orders by name by default
    foreach ($file in (gci "c:\overheardathome\xml\*.dayentry.xml")) {
        $doc = [xml](get-content $file)
        #need a full path here, can't tell from your query what the fully-qualified value is
        foreach($post in $doc.Entries.Entry) {
            $text = new-object TumblrAPI.Post.Text
            #again, may be off on the path within the node, fine-tune this
            $text.Title = $post.Title
            $text.Body = $post.Content
            start-sleep -Milliseconds 500
            write-host ("Response from text post: " + $text.Publish())
        }
    }
    read-host "Done, press return..."
June 27, 2010 16:12
@Troy The source code is included in the zip. Also, here is an example implementation in ASP.NET MVC - A 10 minute twitter search app using dynamic view model http://amazedsaint.blogspot.com/2010/02/10-minute-twitter-search-app-using-duck.html
June 27, 2010 16:22
@amazedsaint: Yes, I already downloaded the zip file and took a look at both the library and the example app. I was suggesting that you upload the code to an actual code repository (such as GitHub - other options are CodePlex, GoogleCode, etc) so that:
1. If your blog ever goes down people can still download the source.
2. People can view the source without having to download code, or optionally download a binary without having to download & compile the source.
3. There is a place to post questions/issues (besides as comments to your post).
4. GitHub in particular makes it easy for others to fork your code, make their own changes to it, and then let you know about it (so you can optionally merge those changes back into your project).
June 28, 2010 9:13
@Troy sure man, will do that.
June 28, 2010 17:51
Along the same lines as the original post, I wondered if anyone had any thoughts on a scenario where the left hand is XML (from a REST service for example), and the right hand is an Entity Framework model with a SQL Server backend, and you need to copy/sync regularly from the left to the right.
REST (using XmlSerializer) -> Entity Framework classes
In this case, I end up with really annoying duplicate classes and all the fugly foreach loops. For example, I'm pulling down project data from a third party REST-based system nightly, and syncing that with an on-site SQL database for reporting. I have a Project class decorated with all the XML attributes for XML deserialization from the left hand, but then have an Entity Framework model that has its own version of a Project class (just named differently) for committing the data on the right hand, and then foreach loops to copy the data between the two similar objects. I can't seem to find a good way to have one single class that is smart enough to be aware of both sides.
June 30, 2010 20:06
var posts = from p in xml.Descendants(ns + "Entry") select p;
Why not simply:
var posts = xml.Descendants(ns + "Entry");
The result is virtually identical; your version simply wraps the anonymous iterator returned from the XContainer.Descendants method in another anonymous iterator from the Enumerable.Select method.
July 01, 2010 15:30
Hey Scott, I just checked out OverheardAtHome and it is hilarious, as well as nicely designed—I like the simple, clean look. I’ve created blogs on Blogspot and Wordpress, but have not tried writing my own codes for them. Have you tried using those sites to host your blog? Is it difficult to submit codes for them, compared to Tumblr?

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.