Scott Hanselman

Back to Basics - Keep it Simple and Develop Your Sense of Smell - From Linq To CSV

February 4, '10 Comments [48] Posted in ASP.NET | Back to Basics
Sponsored By

I was working with a friend recently on a side thing they were doing. They wanted to create an "Export" function for some small bit of data and start it from their website.

  • You'd hit a URL after logging in
  • Some data would come out of a database
  • You'd get a .CSV file downloaded
  • You could open it in Excel or whatever.

I spoke to my friend and they said it was cool to share their code for this post. This post isn't meant to be a WTF or OMG look at that code, as is it meant to talk about some of the underlying issues. There's few things going on here and it's not all their fault, but it smells.

  • They are using a Page when a IHttpHandler will do.
    • Not a huge deal, but there's overhead in making a Page, and they're not using any of the things that make a Page a Page. The call to ClearContents is an attempt at telling the Page to back off. It's easier to just not have a page.
  • They're "thrashing" a bit in the Page_Load, basically "programming by coincidence."
    • They're not 100% sure of what needs to be done, so they're doing everything until it works. Even setting defaults many times or calling methods that set properties and they setting those properties again.
    • This means a few things. First, HTTP is subtle. Second, the Response APIs are confusing (less so in .NET 4) and it's easy to do the same thing in two ways.
  • They're not using using() or IDisposable
    • They're cleaning up MemoryStreams and StreamWriters, but if an exception happens, things'll get cleaned up whenever. It's not tight in a "cleaning up after yourself" deterministic (when it needs to be) kind of way.
    • They're calling Flush() when it's not really needed, again programming by coincidence. "I think this needs to be done and it doesn't break..."
  • Old school data access
    • Not bad, pre se, but it could be easier to write and easier to read. DataAdapters, DataSets, are a hard way to do data access once you've used Linq to SQL, EF or NHibernate.
  • Re-Throwing an Exception via "throw ex"
    • When you want to re-throw an exception, ALWAYS just throw; or you'll lose your current call stack and it'll be hard to debug your code.
  • Not reusable at all
    • Now, reuse isn't the end-all, but it's nice. If this programmer wants different kinds of exports, they'll need to extract a lot from the spaghetti.

Here's what they came up with first.

Note that this code is not ideal. This is the before. Don't use it. 

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.IO;
using System.Configuration;
using System.Data.SqlClient;
using System.Data;

namespace FooFoo
{
public partial class Download : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
System.IO.MemoryStream ms = CreateMemoryFile();

byte[] byteArray = ms.ToArray();
ms.Flush();
ms.Close();

Response.Clear();
Response.ClearContent();
Response.ClearHeaders();
Response.Cookies.Clear();
Response.Cache.SetCacheability(HttpCacheability.Private);
Response.CacheControl = "private";
Response.Charset = System.Text.UTF8Encoding.UTF8.WebName;
Response.ContentEncoding = System.Text.UTF8Encoding.UTF8;
Response.AppendHeader("Pragma", "cache");
Response.AppendHeader("Expires", "60");
Response.ContentType = "text/comma-separated-values";
Response.AddHeader("Content-Disposition", "attachment; filename=FooFoo.csv");
Response.AddHeader("Content-Length", byteArray.Length.ToString());
Response.BinaryWrite(byteArray);
}

public System.IO.MemoryStream CreateMemoryFile()
{
MemoryStream ReturnStream = new MemoryStream();

try
{
string strConn = ConfigurationManager.ConnectionStrings["FooFooConnectionString"].ToString();
SqlConnection conn = new SqlConnection(strConn);
SqlDataAdapter da = new SqlDataAdapter("SELECT * FROM [FooFoo] ORDER BY id ASC", conn);
DataSet ds = new DataSet();
da.Fill(ds, "FooFoo");
DataTable dt = ds.Tables["FooFoo"];

//Create a streamwriter to write to the memory stream
StreamWriter sw = new StreamWriter(ReturnStream);

int iColCount = dt.Columns.Count;

for (int i = 0; i < iColCount; i++)
{
sw.Write(dt.Columns[i]);
if (i < iColCount - 1)
{
sw.Write(",");
}
}

sw.WriteLine();
int intRows = dt.Rows.Count;

// Now write all the rows.
foreach (DataRow dr in dt.Rows)
{
for (int i = 0; i < iColCount; i++)
{

if (!Convert.IsDBNull(dr[i]))
{
string str = String.Format("\"{0:c}\"", dr[i].ToString()).Replace("\r\n", " ");
sw.Write(str);
}
else
{
sw.Write("");
}

if (i < iColCount - 1)
{
sw.Write(",");
}
}
sw.WriteLine();
}

sw.Flush();
sw.Close();
}
catch (Exception Ex)
{
throw Ex;
}
return ReturnStream;
}

}
}

I don't claim to be a good programmer, but I do OK. I went over my concerns with my friend, and suggested first an HttpHandler. I started with Phil's basic abstract HttpHandler (based on my super basic HttpHandler boilerplate). I could have certainly done by just implementing IHttpHandler, but I like this way. They're about the same # of lines of code. The important part is in HandleRequest (or ProcessRequest).

namespace FooFoo
{
public class ExportHandler : BaseHttpHandler
{
public override void HandleRequest(HttpContext context)
{
context.Response.AddHeader("Content-Disposition", "attachment; filename=FooFoo.csv");
context.Response.Cache.SetCacheability(HttpCacheability.Private);

var dc = new FooFooDataContext();
string result = dc.FooFooTable.ToCsv();
context.Response.Write(result);
}

public override bool RequiresAuthentication
{
get { return true; }
}

public override string ContentMimeType
{
get { return "text/comma-separated-values"; }
}

public override bool ValidateParameters(HttpContext context)
{
return true;
}
}
}

At the time I wrote this, I was writing how I wished the code would look. I didn't have a "ToCsv()" method, but I was showing my friend how I though the could should be separated. Even better if there was a clean DAL (Data Access Layer) and Business Layer along with a service for turning things into CSV, but this isn't too bad. ToCsv() in this example is a theoretical extension method to take an IEnumerable of something and output it as CSV. I started writing it, but then decided to Google with Bing, and found a decent direction to start with at Mike Hadlow's blog. He didn't include all the code in the post, but it saved me some typing, so thanks Mike!

namespace FooFoo
{
public static class LinqToCSV
{
public static string ToCsv<T>(this IEnumerable<T> items)
where T : class
{
var csvBuilder = new StringBuilder();
var properties = typeof(T).GetProperties();
foreach (T item in items)
{
string line = string.Join(",",properties.Select(p => p.GetValue(item, null).ToCsvValue()).ToArray());
csvBuilder.AppendLine(line);
}
return csvBuilder.ToString();
}

private static string ToCsvValue<T>(this T item)
{
if(item == null) return "\"\"";

if (item is string)
{
return string.Format("\"{0}\"", item.ToString().Replace("\"", "\\\""));
}
double dummy;
if (double.TryParse(item.ToString(), out dummy))
{
return string.Format("{0}", item);
}
return string.Format("\"{0}\"", item);
}
}
}

This creates an extension method that lets me call something.toCsv() on anything IEnumerable. It'll spin through the properties (yes, I know that could have been a LINQ statement also, but I like a nice ForEach sometimes. Feel free to fix this up in the comments! ;) ) and build up the Comma Separated Values.

At some point, it really should format Dates as {0:u} but as of now, it works identically as the before code and attempts to rectify most of the issues brought up. Of course, one could take something like this as far and make it as robust as they like.

As Mike points out, you can also do little projections to control the output:

string csv = things.Select(t => new { Id = t.Id, Name = t.Name, Date = t.Date, Child = t.Child.Name }).AsCsv();

Your thoughts, Dear Reader?

Related Links

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by SherWeb

Hanselminutes Podcast 198 - Reactive Extensions for .NET (Rx) with Erik Meijer

February 4, '10 Comments [1] Posted in ASP.NET | Podcast | Programming | Silverlight | VS2010 | Windows Client | WPF
Sponsored By

Reactive Extensions for .NET LogoMy one-hundred-and-ninety-eighth podcast is up. I sit down with Erik Meijer from the Cloud Programmability Team to hear about the Reactive Extensions for .NET (Rx). Rx is a library for composing asynchronous and event-based programs using observable collections. Sound boring? Not even a little. Rx is a prescription for awesome.

Subscribe: Subscribe to Hanselminutes Subscribe to my Podcast in iTunes

Download: MP3 Full Show

Links from the Show

Do also remember the complete archives are always up and they have PDF Transcripts, a little known feature that show up a few weeks after each show.

Telerik is our sponsor for this show.

Check out their UI Suite of controls for ASP.NET. It's very hardcore stuff. One of the things I appreciate aboutTelerik is their commitment tocompleteness. For example, they have a page about their Right-to-Left support while some vendors have zero support, or don't bother testing. They also are committed to XHTML compliance and publish their roadmap. It's nice when your controls vendor is very transparent.

As I've said before this show comes to you with the audio expertise and stewardship of Carl Franklin. The name comes from Travis Illig, but the goal of the show is simple. Avoid wasting the listener's time. (and make the commute less boring)

Enjoy. Who knows what'll happen in the next show?

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by SherWeb

Hanselminutes Podcast 197 - The Dynamic Language Runtime, IronRuby and IronPython with Jimmy Schementi

February 4, '10 Comments [1] Posted in DLR | Podcast | Python | Ruby
Sponsored By

image My one-hundred-and-ninety-seventh podcast is up. Scott sits down with Jimmy Schementi to find out what's the scoop with the DLR. Is it baked? What do I need to do to get started? What's the status of IronRuby - is it done? Will IronPython be a first class language or is it already? All these questions and more will be answered.

Subscribe: Subscribe to Hanselminutes Subscribe to my Podcast in iTunes

Download: MP3 Full Show

Links from the Show

Do also remember the complete archives are always up and they have PDF Transcripts, a little known feature that show up a few weeks after each show.

Telerik is our sponsor for this show.

Check out their UI Suite of controls for ASP.NET. It's very hardcore stuff. One of the things I appreciate aboutTelerik is their commitment tocompleteness. For example, they have a page about their Right-to-Left support while some vendors have zero support, or don't bother testing. They also are committed to XHTML compliance and publish their roadmap. It's nice when your controls vendor is very transparent.

As I've said before this show comes to you with the audio expertise and stewardship of Carl Franklin. The name comes from Travis Illig, but the goal of the show is simple. Avoid wasting the listener's time. (and make the commute less boring)

Enjoy. Who knows what'll happen in the next show?

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by SherWeb

Hanselminutes Podcast 196 - .NET 4 CLR, Framework and Language Chat with Jason Olson

February 4, '10 Comments [1] Posted in ASP.NET | BCL | DLR | Learning .NET | Podcast
Sponsored By

image My one-hundred-and-ninety-sixth podcast is up. Jason Olson works (or worked, as you'll hear) for Microsoft in DPE. In this episode he takes Scott a little deeper into some of the new features in .NET 4, including security, CLR changes, C# 4 and VB 10 improvements and the new Task Parallel Library.

Subscribe: Subscribe to Hanselminutes Subscribe to my Podcast in iTunes

Download: MP3 Full Show

Links from the Show

Do also remember the complete archives are always up and they have PDF Transcripts, a little known feature that show up a few weeks after each show.

Telerik is our sponsor for this show.

Check out their UI Suite of controls for ASP.NET. It's very hardcore stuff. One of the things I appreciate aboutTelerik is their commitment tocompleteness. For example, they have a page about their Right-to-Left support while some vendors have zero support, or don't bother testing. They also are committed to XHTML compliance and publish their roadmap. It's nice when your controls vendor is very transparent.

As I've said before this show comes to you with the audio expertise and stewardship of Carl Franklin. The name comes from Travis Illig, but the goal of the show is simple. Avoid wasting the listener's time. (and make the commute less boring)

Enjoy. Who knows what'll happen in the next show?

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by SherWeb

Hanselminutes Podcast 195 - Open Source, Microsoft and The WiX Project with Rob Mensching

February 4, '10 Comments [1] Posted in Deployment | Open Source | Podcast
Sponsored By

WiX Project Logo My one-hundred-and-ninety-fifth podcast is up. The WiX Project was the first big Open Source project out of Microsoft over 10 years ago! Scott talks to project lead Rob Mensching about how the WiX Installer project got started. How much trouble did he have with Microsoft bosses and Legal? What's next for WiX?

Warning, Rob talks fast, so you want want to slow this one down a smidge. ;)

Subscribe: Subscribe to Hanselminutes Subscribe to my Podcast in iTunes

Download: MP3 Full Show

Links from the Show

Do also remember the complete archives are always up and they have PDF Transcripts, a little known feature that show up a few weeks after each show.

Telerik is our sponsor for this show.

Check out their UI Suite of controls for ASP.NET. It's very hardcore stuff. One of the things I appreciate aboutTelerik is their commitment tocompleteness. For example, they have a page about their Right-to-Left support while some vendors have zero support, or don't bother testing. They also are committed to XHTML compliance and publish their roadmap. It's nice when your controls vendor is very transparent.

As I've said before this show comes to you with the audio expertise and stewardship of Carl Franklin. The name comes from Travis Illig, but the goal of the show is simple. Avoid wasting the listener's time. (and make the commute less boring)

Enjoy. Who knows what'll happen in the next show?

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

facebook twitter subscribe
About   Newsletter
Sponsored By
Hosting By
Dedicated Windows Server Hosting by SherWeb

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.