Scott Hanselman

Introducing Windows Azure WebJobs

January 23, 2014. Posted in Azure

I'm currently running 16 web sites on Windows Azure. I have a few Virtual Machines, but I prefer to run things using "platform as a service" where I don't have to sweat the underlying Virtual Machine. That means, while I know I can run a Virtual Machine and put "cron" jobs on it, I'm less likely to because I don't want to mess with VMs or Worker Roles.

There are a few ways to run stuff on Azure. First, there's IAAS (Infrastructure as a Service), which is VMs. Then there's Cloud Applications (Cloud Services), where you can run anything in an Azure-managed VM. It's still a VM, but you have a lot of choice and can run Worker Roles and background stuff. However, there's a lot of ceremony if you just want to run your small "job" either on a regular basis or via a trigger.

Azure Explained in one Image

Looking at this differently, platform as a service is like having your hotel room fixed up daily, while running VMs is more like managing a house yourself.

Azure Explained in one Image

 

As someone who likes to torch a hotel room as much as the next person, this is why I like Azure Web Sites (PAAS). You just deploy, and it's done. The VM is invisible and the site is always up.

However, until now there hasn't been a good solution under Web Sites for doing regular jobs and batch work in the background. Azure Web Sites now supports a feature called "Azure WebJobs" to solve this problem simply.

Scaling a Command Line application with Azure WebJobs

When I want to do something simple - like resize some images - I'll either write a script or a small .NET application. Things do get complex though when you want to take something simple and do it n times. Scaling a command line app to the cloud often involves a lot of yak shaving.

Let's say I want to take this function that works fine at the command line and run it in the cloud at scale.

public static void SquishNewlyUploadedPNGs(Stream input, Stream output)
{
    var quantizer = new WuQuantizer();
    using (var bitmap = new Bitmap(input))
    {
        using (var quantized = quantizer.QuantizeImage(bitmap))
        {
            quantized.Save(output, ImageFormat.Png);
        }
    }
}

WebJobs aim to make developing, running, and scaling this easier. They are built into Azure Web Sites and run in the same VM as your Web Sites.

Here are some typical scenarios that would be great for the Windows Azure WebJobs SDK:

  • Image processing or other CPU-intensive work.
  • Queue processing.
  • RSS aggregation.
  • File maintenance, such as aggregating or cleaning up log files. 
  • Other long-running tasks that you want to run in a background thread, such as sending emails.

WebJobs are invoked in two different ways: either they are triggered or they run continuously. Triggered jobs run on a schedule or when some event happens, and continuous jobs basically run a while loop.
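As a rough sketch of what "run a while loop" means for a continuous job, here's a plain console app that polls for work and backs off when there's none. It doesn't use the WebJobs SDK at all. The names ContinuousJobSketch and Run and the in-memory pending counter are made up for illustration, and a real continuous job would loop forever rather than stopping after five passes.

```csharp
using System;
using System.Threading;

public static class ContinuousJobSketch
{
    // Polls for work until keepRunning returns false.
    // Returns how many work items were handled.
    public static int Run(Func<bool> tryProcessOne, Func<bool> keepRunning, TimeSpan idleDelay)
    {
        int handled = 0;
        while (keepRunning())
        {
            if (tryProcessOne())
                handled++;                 // found work: poll again right away
            else
                Thread.Sleep(idleDelay);   // nothing to do: back off before polling again
        }
        return handled;
    }

    public static void Main()
    {
        int pending = 3;   // hypothetical stand-in for a queue or folder of work items
        int passes = 0;

        // A real continuous job would pass () => true here.
        // This demo stops after 5 passes so that it terminates.
        int handled = Run(
            () => { if (pending == 0) return false; pending--; return true; },
            () => passes++ < 5,
            TimeSpan.FromMilliseconds(100));

        Console.WriteLine("Handled {0} item(s).", handled);
    }
}
```

The idle delay matters: without it, a continuous job with nothing to do would spin the CPU on the same VM that is serving your site.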

WebJobs are deployed by copying them to the right place in the file-system (or using a designated API which will do the same). The following file types are accepted as runnable scripts that can be used as a job:

  • .exe - .NET assemblies compiled with the WebJobs SDK
  • .cmd, .bat, .exe (using windows cmd)
  • .sh (using bash)
  • .php (using php)
  • .py (using python)
  • .js (using node)

After you deploy your WebJobs from the portal, you can start and stop jobs, delete them, upload jobs as ZIP files, etc. You've got full control.

A good thing to point out, though, is that Azure WebJobs are more than just scheduled scripts. You can also create WebJobs as .NET projects written in C# or whatever.

Making a WebJob out of a command line app with the Windows Azure WebJobs SDK

WebJobs can effectively take some command line C# application with a function and turn it into a scalable WebJob. I spoke about this over the last few years in presentations when it was codenamed "SimpleBatch." This lets you write a simple console app to, say, resize an image, then move it up to the cloud and resize millions. Jobs can be triggered by the appearance of new items on an Azure Queue, or by new binary Blobs showing up in Azure Storage.

NOTE: You don't have to use the WebJobs SDK with the WebJobs feature of Windows Azure Web Sites. As noted earlier, the WebJobs feature enables you to upload and run any executable or script, whether or not it uses the WebJobs SDK framework.

I wanted to make a WebJob that would squish PNGs as I upload them to Azure storage. When new PNGs show up, the job should automatically run on them. This is easy as a command line app using the nQuant open source library, as in the code above.

Now I'll add the WebJobs SDK NuGet package (it's prerelease) and the Microsoft.WindowsAzure.Jobs namespace, add [BlobInput] and [BlobOutput] attributes, and then start the JobHost() from Main. That's it.

using Microsoft.WindowsAzure.Jobs;
using nQuant;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // Start the host and block so the process stays alive to handle new blobs.
            JobHost host = new JobHost();
            host.RunAndBlock();
        }

        // Called for each new blob in the "input" container; the result is written
        // to a blob with the same name in the "output" container.
        public static void SquishNewlyUploadedPNGs(
            [BlobInput("input/{name}")] Stream input,
            [BlobOutput("output/{name}")] Stream output)
        {
            var quantizer = new WuQuantizer();
            using (var bitmap = new Bitmap(input))
            {
                using (var quantized = quantizer.QuantizeImage(bitmap))
                {
                    quantized.Save(output, ImageFormat.Png);
                }
            }
        }
    }
}

CONTEXT: Let's just step back and process this for a second. All I had to do was spin up the JobHost and set up a few attributes. Minimal ceremony for maximum results. My console app is now processing information from Azure blob storage without ever referencing the Azure Blob Storage API!

The function is automatically called when a new blob (in my case, a new PNG) shows up in the input container in storage and the Stream parameters are automatically "bound" (like Model Binding) for me by the WebJobs SDK.

To deploy, I zip up my app and upload it from the WebJobs section of my existing Azure Website in the Portal.

image

Here it is in the Portal.

image

I'm setting mine to continuous, but it can also run on a detailed schedule:

image

I need my WebJob to be told about what Azure Storage account it's going to use, so from my Azure Web Site under the Connection Strings section I set up two strings, one for the AzureJobsRuntime (for logging) and one for AzureJobsData (what I'm accessing). 

image

For what I'm doing they are the same. The connection strings look like this:

DefaultEndpointsProtocol=https;AccountName=hanselstorage;AccountKey=3exLzmagickey
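To make the shape of that string concrete, here's a small sketch that splits it into its key/value settings. This is only an illustration of the format, not how the WebJobs SDK reads it. The ConnectionStringSketch class and its Parse method are hypothetical names. Each pair is split on the first "=" only, since real storage account keys are Base64 and can end in "=".

```csharp
using System;
using System.Collections.Generic;

public static class ConnectionStringSketch
{
    // Splits "Key=Value;Key=Value" into a case-insensitive dictionary.
    public static Dictionary<string, string> Parse(string connectionString)
    {
        var settings = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        var pairs = connectionString.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (var pair in pairs)
        {
            // Split on the FIRST '=' only; the value itself may contain '='.
            int eq = pair.IndexOf('=');
            if (eq <= 0)
                throw new FormatException("Expected Key=Value but found: " + pair);

            settings[pair.Substring(0, eq).Trim()] = pair.Substring(eq + 1).Trim();
        }
        return settings;
    }

    public static void Main()
    {
        var settings = Parse(
            "DefaultEndpointsProtocol=https;AccountName=hanselstorage;AccountKey=3exLzmagickey");

        // Print only the non-secret parts; never log the account key.
        Console.WriteLine(settings["DefaultEndpointsProtocol"]);
        Console.WriteLine(settings["AccountName"]);
    }
}
```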

The key here came from Manage Access Keys in my storage account, here:

image

In my "Hanselstorage" storage account I made two containers, input and output. You can name yours whatever. You can also process from Queues, etc.

image

Now, going back to the code, look at the parameters to the Attributes I'm using:

public static void SquishNewlyUploadedPNGs(
    [BlobInput("input/{name}")] Stream input,
    [BlobOutput("output/{name}")] Stream output)

The strings "input" and "output" point to specific containers in my Storage account. Again, the actual storage account (Hanselstorage) is part of the connection string. That lets you reuse WebJobs in multiple sites, just by changing the connection strings.
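If the {name} placeholder feels like magic, here's one way to picture it: the name captured from the input path is substituted into the output path. The sketch below is my own illustration of that idea using a regular expression. It is not the SDK's actual implementation, and BlobPathSketch and MapPath are hypothetical names.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class BlobPathSketch
{
    // Matches blobPath against a pattern like "input/{name}" and fills the captured
    // values into outputPattern. Returns null when the path doesn't match.
    public static string MapPath(string inputPattern, string outputPattern, string blobPath)
    {
        // Build a regex such as "^input/(?<name>.+)$" from "input/{name}".
        var regex = new StringBuilder("^");
        int last = 0;
        foreach (Match m in Regex.Matches(inputPattern, @"\{(\w+)\}"))
        {
            regex.Append(Regex.Escape(inputPattern.Substring(last, m.Index - last)));
            regex.Append("(?<" + m.Groups[1].Value + ">.+)");
            last = m.Index + m.Length;
        }
        regex.Append(Regex.Escape(inputPattern.Substring(last)));
        regex.Append("$");

        Match match = Regex.Match(blobPath, regex.ToString());
        if (!match.Success)
            return null;

        // Replace each {placeholder} in the output pattern with its captured value.
        return Regex.Replace(outputPattern, @"\{(\w+)\}",
            p => match.Groups[p.Groups[1].Value].Value);
    }

    public static void Main()
    {
        // Prints "output/scottha.png".
        Console.WriteLine(MapPath("input/{name}", "output/{name}", "input/scottha.png"));
    }
}
```

A blob that lands outside the input container, such as "other/x.png", doesn't match the pattern, so in this sketch MapPath returns null.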

There is a link to the Azure WebJobs Dashboard to the right of your job, and the URL follows this format: https://YOURSITE.scm.azurewebsites.net/azurejobs. You'll need to enter the same credentials you use for Azure deployment.

Once you've uploaded your job, you'll see the registered function(s) here:

image

I've installed the Azure SDK and can access my storage live within Visual Studio. You can also try 3rd party apps like Cloudberry Explorer. Here I've uploaded a file called scottha.png into the input container.

image

After a few minutes the SDK will process the new blob (queues are faster, but blobs are checked every 10 minutes). The job will run and either succeed or fail. If your app throws an exception, you'll actually see it in the Invocation Details part of the site.

image

Here's a successful one. You can see it worked (it squished) because of the number of input bytes and the number of output bytes.

image

You can see the full output of what happens in a WebJob within this Dashboard, or check the log files directly via FTP. For me, I can explore my output container in Azure Storage and download or use the now-squished images. Again, this can be used for any large job whether it be processing images, OCR, log file analysis, SQL server cleanup, whatever you can think of.

Azure WebJobs is in preview, so there will be bugs, changing documentation and updates to the SDK but the general idea is there and it's solid. I think you'll dig it.

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

January 23, 2014 5:11
Azure Web Sites run Python (2.7), as well. And PTVS can deploy straight to them.
January 23, 2014 5:28
This certainly fills an immediate need for me...will allow me to pre aggregate some reporting numbers and various other things specific to certain sites. beats task scheduler on a vm, which is what I was already planning on doing. Actually more excited for the cost savings than the convenience, but i'll take em both :)

A couple of questions. how does this work with auto scaling? If i'm running on two instances will my job only run on one of those instances or will i need to plan for my jobs to scale with the sites. Also, If my job is cpu intensive and i'm using cpu auto scale, will my job trigger the scale out?
January 23, 2014 5:44
Chris, for continuous webjobs - they will run on all of your instances.
for triggered - only a single instance of a run can run at a given time so when triggered the webjob will run on one of your instances (randomly).
January 23, 2014 5:59
So powershell isn't an option? Seems odd that even shell scripts would run but not PS.
January 23, 2014 6:14
I am curious why PowerShell is not supported as native scripts. There are simple workarounds, but it is just strange seeing BASH and .js scripts have native support but .ps1 scripts do not.

Keep up the great work!
January 23, 2014 6:51
I've worked around this problem using a worker role, continuously looping.

This seems much better ... but ... the deployment process is terrible. I need to upload a zip every time I want to release? Yuck!

This would be perfect if there was some way to trigger deployments from Git, or at least through Visual Studio.
January 23, 2014 6:52
@Kevin There is a bug tracking the powershell support https://github.com/projectkudu/kudu/issues/985
January 23, 2014 6:55
@Paul, you can deploy using Kudu. Eventually there will be support for deploying through Visual Studio as well which will make it easier to publish a triggered or continuous WebJob
January 23, 2014 7:34
@Pranav That's good news.

Can you elaborate on 'you can deploy using Kudu'? I can see a commit where WebJobs support was added to Kudu (https://github.com/projectkudu/kudu/commit/59561993c49d5935ce91ca062d800a6e57b1ad5f) but am not sure how to use it.
January 23, 2014 8:09
@Paul Please read this post, which explains how WebJobs are stored on the server, so you can use that folder structure to push using other deployment mechanisms: http://blog.amitapple.com/post/74215124623/deploy-azure-webjobs
January 23, 2014 9:43
Nice article.. Something that I was looking for.
Connection string shows the Account Key!!
January 23, 2014 10:46
Scott,

Excellent article. I have been doing job-based processing for one of my big projects in an asp.net web site using timers & threads.
But that was not a convenient & traceable solution.

With the advent of this tool, it can help a lot to design queue-based & job-based processing scripts.

Gr8.

Thanks for sharing.
January 23, 2014 10:58
This sounds like a useful service. Recently I've just achieved something similar with just Web Sites and Scheduler, but I'll definitely try this as well.
Just a question, there is no info about pricing, it will be part of Web Sites pricing model or something new?
January 23, 2014 12:31
That sounds great, will have to check that out, since I am already running headless VMs.

Two questions though:
1) how will this scale? Does it start multiple instances and balance the work?
2) will there be VCS access like for websites, so one can push updates?

Btw, do you know why a website can't listen to all request? I have seen that requests with a different 'Host' header get blocked and never reach the website :-/ this is sadly the reason why I have to use CloudServices for stuff, where I don't know all domains in advance.
January 23, 2014 12:41
@Martin I'd assume it would be included in the price of the Instance, we don't pay for the number of websites so I'd take it that it would be the same for web jobs. Just a speculation mind.
January 23, 2014 13:59
While I can run PowerShell code through a .cmd file, I would be happy to see .ps1 inside the following list: .cmd, .bat, .exe.
January 23, 2014 18:40
Just to let you know you've published your account key as part of connection string above, then blanked it out in the image...
January 23, 2014 18:58
Hello Scott, thanks for this nice post!
Is it possible to access an Azure SQL Database from a webjob?
In my website I need to aggregate some data (read the purchase orders table and fill in the stats table)every 24 hours: do you think a webjob could be appropriate or would it be better to use the Scheduler service?

Thanks
January 23, 2014 22:49
Some answers:


* No need to pay extra for this feature, note that for continuous WebJobs there is an important feature called "always on" which is only for Standard Website, this will make sure your Website and WebJob are always up (won't be the case for free/shared websites), so you can experiment using free but for full usage you need standard.

* You can access from a WebJob the same things as from a Website, if you use .NET application you can even access your app settings and connection strings as in your Website (for other types you can use environment variables to access those).

Timm - Regarding scale see my previous comment regarding your issue, you can start a thread on the windows azure forum.
January 23, 2014 22:51
@Valerio - A WebJob can use the scheduler service, just create a new one and select a schedule for it.
January 23, 2014 23:24
Still no answer to "WHY!" PowerShell is not included in WebJobs. This automation technology has been mature long enough and should not be ignored nowadays.
January 23, 2014 23:42
Spend any time dealing with Worker Roles for basic repeatable tasks and you quickly realize how much of a big deal this update is!
January 24, 2014 0:29
Max - Because this is v0.1. PowerShell is on the list.
January 24, 2014 0:41
Are there plans to let me publish by pushing to BitBucket/GitHub/CodePlex? Other than that, this is perfect, I was just talking with some coworkers about how I wish this existed.
January 24, 2014 1:51
This is so cool, I will be using on my app as soon as it is practically possible.
January 24, 2014 2:42
@Job - Yes, look at the following posts:
http://blog.amitapple.com/post/74215124623/deploy-azure-webjobs
http://blog.amitapple.com/post/73574681678/git-deploy-console-app
January 24, 2014 7:40
I would love this to be available in some form in on-premise ASP.NET MVC/IIS. Yes, I know I can write a service, but that means additional deployment complexity. Yes, I know I can spawn a background thread and register via IRegisteredObject to be aware of when the App Pool is being recycled, but that's a hack. I know I can go down the WAS path with MSMQ, but that's a configuration nightmare.

I just want to reliably run some background activity in my on-premise ASP.NET Web App without complicating my deployment beyond Web Deploy or relying upon hacks that are susceptible to app pool recycling.
January 24, 2014 13:01
Great news. As soon as websites is available in XS I'm selling the house and destroying hotel rooms :-)
January 24, 2014 19:35
This feature seems a little weird to me. Why would you want the scalability of your background "Jobs" to be tied inextricably to the scalability of your Web Site? Chances are one has nothing to do with the other.

This feature should have just been called "Jobs" and not be tied at all to Web Sites. This could have just been an update to Worker Roles instead of introducing a whole new redundant thing.
January 24, 2014 20:26
What are the 16 sites?
January 24, 2014 20:43
How do WebJobs compare and contrast with Worker Roles?
January 25, 2014 0:25

This is great Scott as I currently use a cron job to periodically call a web api to dump a work request on a ServiceBus Q which is then picked up by the Worker Role to 'do stuff'. Azure really is a great toolkit and it keeps getting better!
January 26, 2014 5:28
@Paul, we have just released some preview support in Visual Studio for publishing webjobs via a web project. To download and learn more visit http://visualstudiogallery.msdn.microsoft.com/f4824551-2660-4afa-aba1-1fcc1673c3d0. If you try it out please let us know what you think.


Sayed Ibrahim Hashimi | @SayedIHashimi
January 26, 2014 7:23
Scott,
Microsoft.WindowsAzure.Jobs.Host is another Nuget package that is needed.
February 07, 2014 23:51
Awesome new service and great article! I had to laugh when I read "freakin' Erlang" in the first chart as I will replace the only Erlang application I use with this new service: RabbitMQ :)
March 07, 2014 1:51
@Sayed, used WebJobVs and it worked first time, saved me from figuring out how to create appropriate folder structure and I was able to deploy a WebSite+Webjob directly from GitHub with a simple update to an existing project.

As stated already, keep up the great work. WebJobs are a great and simple add-on for long running/triggered tasks that complement WebSites, without the overhead of defining/managing a full Worker Role.
March 12, 2014 9:48
Hi,
I'm trying to use webjobs to hit a specific url on my associated azure website per instance. Someone mentioned above that Continuous web jobs will fire on every instance you have if you scaled out. On each instance could I have the web job somehow (via webclient?) hit a url on that instance only?
Regards,
Matt
March 13, 2014 14:13
I've got a WebJob running that I only ever want one instance of. If the website autoscales I still only want the one WebJob instance. My WebJob is a simple .NET console application that runs in a loop picking off messages from an Azure Service Bus to process. If two WebJob instances were to run, a message would be processed twice.

Is there any quick way to detect that "another" instance is running and simply sit there and do nothing?

Having the option to select "single instance only" on WebJobs would be great.
March 14, 2014 4:54
Continuous WebJobs support "singleton" mode which will make sure your continuous WebJob only runs on a single instance, see https://github.com/projectkudu/kudu/wiki/Web-jobs for details on how to enable it.
March 27, 2014 3:50
I was happy about this feature and immediately experimented with it a few days ago. Up until late this afternoon, my .exe webjob was working fine. Then it started to fail at every invocation. The error message is:
Status changed to Running
[03/26/2014 23:41:58 > 8fa263: ERR ] 'IncidentPoll.exe' is not recognized as an internal or external command,
[03/26/2014 23:41:58 > 8fa263: ERR ] operable program or batch file.
Any ideas?
April 16, 2014 13:25
@James, take a look at HangFire. It is like WebJobs, but for on-premise ASP.NET/IIS applications, backed by SQL Server or Redis. It does not require an external Windows Service/Windows Scheduler to be reliable and is aware that an ASP.NET application can be recycled at any time. And it is simple enough and open-source.

Moreover, if someone implements support for Azure Queue and Table Storage in HangFire, there will be a much simpler alternative to WebJobs for Windows Azure.

Comments are closed.

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.