Deep Linking in JavaScript and Ajax Applications

Posted by admin | Ajax&Js | Wednesday 13 January 2010 8:36 am

Last week I posted a tutorial that used a simple application to demonstrate how to build progressive enhancement into your Ajax projects. The one major flaw in the final Ajax-driven page from that tutorial is the lack of deep linking when JavaScript is enabled.

Although the resulting code is clean, works well, and is easy to maintain, the lack of deep linking is enough to cause a client to balk at the use of progressive enhancement in such a circumstance. So, in this brief tutorial, I’ll describe how to incorporate deep linking into that page.

If you haven’t already gone through the previous tutorial, doing so might help you get up to speed — but it’s not absolutely necessary, since the code we’ll be using is pretty straightforward.

Step 1: Review the Primary Function From the Previous Tutorial

The final JavaScript code from the previous tutorial included a function called getEmployeeInfo, which is the primary piece of code we’ll be working with in this tutorial. Here is that function:

function getEmployeeInfo() {

	var myLinksCollection = document.getElementsByTagName("a");

	for (var i=0;i<myLinksCollection.length;i++) {
		myLinksCollection[i].onclick = function() {
			if (this.href.indexOf("view=") !== -1) {
				var clickedHREF = this.href;
				var clickedView = clickedHREF.split("view=");
				ajaxInitiate(clickedView[1]+'.html');
				return false;
			}
		}
	}

}

We’ll be adding a few lines of code to the above function in order to implement deep linking into this application.

Step 2: Add the Hash Character to the URL

In order to deep link to a particular state in the application, we want to utilize the hash character (#), using it to append an identifier to the URL of the page. JavaScript lets us set the current hash using the following line of code:

location.hash = "identifier";

That line can be added anywhere inside the anonymous function. It adds a hash to the current URL, followed by the word “identifier”, without reloading the page. If a hash already exists, the text following the hash is changed to “identifier”. Changing only the hash keeps the navigation within the current page, so no page refresh occurs, just as with an anchor tag whose href value begins with a hash. Of course, we don’t want to put just any old text in there; we want text that will help us identify the state of the Ajax application.
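To see the effect of assigning to the hash without opening a browser, here is a minimal sketch. The helper name withHash and the example.com URLs are my own assumptions for illustration, and the sketch relies on the standard URL class found in current browsers and Node.js, which the tutorial's code does not use.

```javascript
// Hypothetical helper (not part of the tutorial's code): returns the URL that
// results from assigning `identifier` to the hash of `currentUrl`.
function withHash(currentUrl, identifier) {
  var url = new URL(currentUrl); // standard URL class, assumed to be available
  url.hash = identifier;         // mirrors location.hash = identifier
  return url.href;
}

// No hash yet: one is appended.
console.log(withHash("http://example.com/index.php", "employee2"));
// A hash already exists: the text after "#" is replaced.
console.log(withHash("http://example.com/index.php#employee1", "employee2"));
```

Both calls should print the same address ending in #employee2, which is the behaviour described above: the hash is added when it is missing and replaced when it is already there.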

In the anonymous function, the current state is identified by the clickedView array, so we can use that to append an identifier to the current URL, like this:

location.hash = clickedView[1];

Let’s take a look at how the page works after adding that line to the code:

Demo Page #1

Now the content is loading correctly, and each time a link is clicked, the value following the hash character in the URL is changed, helping us identify what state the application is in.

Step 3: Change Content According to the Current State

Now that the application is appending a state identifier to the current URL, we need to check to see if that identifier exists, and display the correct information. Here’s the code that will accomplish this:

if (location.hash) {
	ajaxInitiate(location.hash.replace("#","")+'.html');
}

It’s pretty simple: We check whether the URL contains a hash. If it does, we call the ajaxInitiate function, passing the hash text as the file name. Since the location.hash property includes the actual hash symbol (#), we’re using the replace method to strip out the hash symbol and get a clean file name.

The above code is added before the loop in the getEmployeeInfo function, so the complete function now looks like this:

function getEmployeeInfo() {

	if (location.hash) {
		ajaxInitiate(location.hash.replace("#","")+'.html');
	}

	var myLinksCollection = document.getElementsByTagName("a");

	for (var i=0;i<myLinksCollection.length;i++) {
		myLinksCollection[i].onclick = function() {
			if (this.href.indexOf("view=") !== -1) {
				var clickedHREF = this.href;
				var clickedView = clickedHREF.split("view=");
				ajaxInitiate(clickedView[1]+'.html');
				location.hash = clickedView[1];
				return false;
			}
		}
	}

}
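The hash handling at the top of the function can also be pulled out into a small helper, which makes it easier to try on its own. This is only a sketch: the name fileFromHash is hypothetical, and the character check is an extra precaution I am assuming here (the original code passes the hash text straight through).

```javascript
// Hypothetical helper: turns a location.hash value into a file name, or
// returns null when there is nothing sensible to load.
function fileFromHash(hash) {
  var identifier = hash.replace("#", ""); // strip the "#", as in the tutorial
  // Assumption: identifiers only contain letters, digits, hyphens and underscores.
  if (!/^[A-Za-z0-9_-]+$/.test(identifier)) {
    return null;
  }
  return identifier + ".html";
}

console.log(fileFromHash("#employee3")); // a usable file name
console.log(fileFromHash(""));           // no hash in the URL
console.log(fileFromHash("#../secret")); // rejected by the character check
```

Inside getEmployeeInfo, this could be used as var file = fileFromHash(location.hash); followed by if (file) { ajaxInitiate(file); }, so that an empty or unexpected hash simply leaves the default content in place.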

Drawbacks to this Method

There are two drawbacks to this method. One small drawback is that the content does not change if the URL is manually altered after the initial page view. The hash will only affect the page’s content on the first page view, when the entire page is initially loaded. But that’s not a big deal, since the content can easily be changed by clicking any of the links after the initial page view. The only time we’re really concerned about the state of the hash is on the initial page load, so it works fine in that respect.

The other drawback is that this method does not preserve the back button. You’ll notice that if you use the back button to try to visit a previous state, the URL will change, but the content is not affected. This happens because the URL change is only internal, and does not initiate a full page refresh. Preserving the back button in Ajax applications is a more complicated process that is beyond the scope of this simple tutorial.

Conclusion

And that’s it. Our application now has deep linking, and works in virtually the same way as the PHP-only version, aside from the drawbacks mentioned above.

Use the links below to view the final demo page, or download the complete code, including the code from the previous tutorial. The link to the demo page includes a hash identifier, so you can see how the code works.

Related posts:

  1. Building an Ajax Application with Progressive Enhancement
  2. Ajax From the Ground Up: Part 1 – XMLHttpRequest
  3. Ajax From the Ground Up: Part 2 – Sending Data to the Server




Response to “In Defense of Vertical Navigation”

Posted by admin | Ajax&Js | Wednesday 13 January 2010 8:36 am

It’s been quite a start to this week since the publication of my article on Smashing Magazine called The Case Against Vertical Navigation. I really didn’t expect this type of response. I assumed that what I was stating was a fairly commonly held view among designers.

Since there have been a lot of criticisms of Smashing Magazine over the past year (mainly because of endless “list” posts), Vitaly Friedman was more than happy to publish an opinion piece on a specific aspect of design. So, if you haven’t read the original article yet, please do. And please read Kyle Meyer’s response to my article, which I will be responding to here.

I’m glad Kyle posted his response; as Jacob Gube mentioned in both SM’s comments and on Kyle’s site, this type of discussion is good, regardless of who is right and wrong.

My main beef with Kyle’s response to my article is that I don’t think he quite understands that my original article agrees with him on many points. I think a lot of people got a little sensitive over the title of the article. But, let’s be honest here. If I had called the article “Pros and Cons of Vertical Navigation” (and approached the theme from that super-objective angle), it would not have elicited the response it did. Let’s now consider what Kyle had to say.

Item One: It Discourages Information Architecture

He begins by responding to my heading “It Discourages Information Architecture”, then displays a screen shot from Black Estate Vineyard. I love that site; it’s beautiful, and he’s absolutely right that the aesthetic would be lost with a horizontal navigation. But that doesn’t mean that the site is better architected, or properly focused on content.

What’s interesting is that this site has only one page. Yes, it has a left-hand navigation menu, but those links are internal page anchors that trigger an animated scroll to different parts of the page. I think a better case study would be a site that has more depth, content-wise.

Also, Kyle is fully aware that I posted 6 “exceptions” to my “rule”, one of which was “a minimalist design”, which means this site would fully qualify as an exception since I personally consider this to be a minimalist design. So, does my original article disagree with this site’s use of vertical navigation? Absolutely not.

Item Two: It Wastes Prime Screen Real Estate

In the next section he discusses my heading “It Wastes Prime Screen Real Estate”. His argument here is weak and is not very well thought out. To quote him:

It would be ignorant to assume that just because navigation is placed to the left or right of a site in a vertical fashion that no content could fall below it and continue to make use of the space as you scroll down the page. Consider your typical blog template: content in a main column, navigation in a sidebar.

Okay, sounds fine so far. But then he says:

More importantly, beneath the navigation, typically lies other sub-content that is less important to the user as they scroll the page looking at the main post content. With a horizontal model, this content would have to be placed above the main content, forcing it lower on the page. Below the main content, forcing it far down the page and demoting it to the footer’s territory, and the lack of importance that may come with this. Finally, perhaps inside the main content, creating a distraction from the title to content to post meta data flow that users are so accustomed to.

I’m having trouble confirming exactly what his point is, but it doesn’t sound like a very good argument. He’s saying that with a horizontal model, the content that would normally be under a vertical sidebar would be forced either above the content, below, near the footer, or else somewhere in between, causing a distraction. I disagree completely, and think he’s using a very rare circumstance to provide a weak counter argument. I’m going to quote directly from my original article to counter this point:

In paragraph 3 of my introduction, I said:

It should be noted here that when I refer to “vertical navigation”, I’m talking about the top-level, primary means of navigating a website. This would not include left or right sidebars that have secondary links and call-to-action areas that are perfectly acceptable in many circumstances.

So, what this all boils down to is that I agree with Kyle: There will be content in left and/or right sidebars, in addition to the primary content beside those bars, except this would all be found underneath the primary horizontal navigation. There is no need to move any of that content to the top or bottom.

I don’t think his argument in this regard is very strong at all, and I’m a little surprised he would even make it. Of course, what he said was a tad confusing (possibly because he rushed to publish it, which is fine), so maybe I’m misunderstanding his point.

He next discusses a screen shot from Shea, saying that it conforms well to the F-pattern hierarchy, which it does. But I don’t think he understands how much more can be done with the left side of the page. Yes, that site is beautiful, but would it be less beautiful if it was planned with a horizontal bar? It actually has a horizontal bar, which could easily hold the links in the navigation, and allow more important things (like sidebar calls-to-action) to be included on the left side instead. Of course, the design would have to be reconsidered, so I’m not saying here that a horizontal navigation would fit well with this layout and design. I’m just saying that when a site is planned, if the left side of the screen is viewed as prime screen real estate, it is best to avoid putting common page elements in that area.

And if you visit the sub-pages on that site, you’ll notice that all the space below the navigation is completely wasted, proving my point that prime real estate has been ignored, and countering his argument that vertical navigation still allows for placement of content below the navigation.

Also, although the Shea website is attractive, I get a dated feeling from it. When I look at that site, I don’t think “modern web design”; I think “old school”. Of course, that’s just my opinion, but I don’t think that site compares to most of the beautiful and modern-looking sites you see in CSS galleries nowadays.

Item Three: It Doesn’t Conform to Real-Life Reading

In the next section he discusses my heading “It Doesn’t Conform to Real-Life Reading”. This is one of the points that I really think got lost on almost everyone, as proven by the argument he attempts to make in rebuttal:

There are numerous examples elsewhere in the world where a person is asked to read something that resembles a vertical navigation. Restaurant menus, lists in books or emails. In fact, lists in general, which is what we use to semantically mark up navigation menus, no?

Maybe I didn’t make myself clear enough in the article, so I apologize for that. But my point on “real life reading” has nothing to do with things being stacked vertically, which I completely agree with as a common thing found in everyday reading. I’ll quote from my article to explain myself here:

There aren’t many areas in life where a person is asked to read something that has a “left hand menu” that resembles what we find on websites that feature vertical navigation. In general, people are accustomed to reading content that spans the entire width of the reading area, or else is broken up into boxes or columns within the reading area. In either case, the content is vertically sandwiched between a header and footer. Books and magazines are a good example of this.

My point here is not that “vertically stacked” menus are rare; it’s that content with a “vertically stacked menu” next to it is rare. Of course, I’m not expecting content to span the entire width of the page anyhow, but I think it’s more common to find varying content to the left or right (in real-life reading) as opposed to a vertical “menu”.

Item Five: It’s Not as Successful, According to Studies

In the next section, I will concede his points, because, as many people in the comments pointed out, the studies are difficult to use in this context. I probably should have thought that through more, but I guess my point was this:

One study showed that horizontal navigation was more successful, and since people read content in an F-pattern, it’s best to use horizontal navigation and use more important elements to form the “F”. Also, I would like to know if there’s ever been a study that shows vertical navigation to be more successful than horizontal. Of course, he makes many good points about white space, and drawing the eyes, and the significance of looking at something longer, and whether that should even be viewed as a good thing.

Conclusion

In his conclusion, he states:

The point I would like to make is not that vertical navigation is in any way better than its horizontal counterpart. They both have pros and cons and are situationally more useful than each other. This is part of being a designer. Knowing the design patterns available to you and having the discretion to use the proper one at the proper time.

I agree wholeheartedly, which I think I stated indirectly in my article when I gave 6 possible exceptions to the “rule”.

I hope some of what I said in the original article was made clearer here, and I encourage everyone, as Kyle did, to continue considering reasons for your design choices, and not just following trends for the sake of trends.


Building an Ajax Application with Progressive Enhancement

Posted by admin | Ajax&Js | Wednesday 13 January 2010 8:36 am

If you’ve done your best to keep up with web development trends over the past five years or more, then it’s likely that you’re familiar with the concept of Progressive Enhancement. I’m not going to provide an explanation of that technique here, but instead, I thought I would use a small Ajax-driven page to demonstrate how progressive enhancement can be implemented.

The mini-app we’ll be building in this tutorial is an employee information page. It will consist of a series of links at the top of the page that will determine what employee info is displayed in the content area. The information will be held inside of include files, to simplify the process (as opposed to a database or XML file which might be more practical in a real-world app). Although we’re going to use Ajax to display the information, we’re going to ensure that the same information is displayed even when the user is visiting the page without JavaScript capabilities.

View the Page We’ll Be Building

Step 1: Create the Primary PHP-driven Web Page

Since this tutorial is for the purpose of demonstrating progressive enhancement, the first thing we need to do is create the entire, fully-functional web page using a back-end technology. In this example, I’m using PHP, so a little bit of knowledge in that area would be helpful if you want to follow along with every step. From there, we’ll progressively enhance our mini-app with JavaScript and Ajax.

Here is the code for our main page:

<ul>
	<li><a href="?view=employee1">Employee 1</a></li>
	<li><a href="?view=employee2">Employee 2</a></li>
	<li><a href="?view=employee3">Employee 3</a></li>
	<li><a href="?view=employee4">Employee 4</a></li>
	<li><a href="?view=employee5">Employee 5</a></li>
</ul>

<div id="employee-info">
<?php include "employee.php"; ?>
</div>

Ideally, we would create the page so that the number of employees is flexible, but that would require some more complex PHP code. Since this tutorial is not about PHP, but about progressive enhancement, I’m simplifying things for demonstration purposes.

Three things to note about the above code:

  • Each link points to the same page with a query string value appended to identify which employee link is clicked
  • The content section is identified by a unique ID
  • The content section is populated by a PHP include

Step 2: Create the employee.php Include File

The employee.php include file is the engine that runs the page. We want the output of this file to change depending on what link is clicked. But we also want the employee.php file to be abstracted from the data itself, so we can update the data through a more practical means. In this example, as mentioned, I’m using separate include files for each employee. Most likely this information would be contained in a database or in an XML file, but I’m simplifying this to make things easier.

Let’s take a look at the code in our primary include file:

<?php
if (isset($_GET["view"])) {
	switch ($_GET["view"]) {
		case "employee1":
			include "employee1.html";
			break;
		case "employee2":
			include "employee2.html";
			break;
		case "employee3":
			include "employee3.html";
			break;
		case "employee4":
			include "employee4.html";
			break;
		case "employee5":
			include "employee5.html";
			break;
		default:
			?>
			<p>That employee doesn't exist.</p>
			<?php
			break;
	}
} else {
	?>
	<p>Click a link to view information about an employee.</p>
	<?php
}
?>

The above code is fairly straightforward, and shouldn’t require much explanation, even for beginning PHP developers. The first line checks to see if the “view” query string variable has been set. If it has, then a switch statement is initialized that includes a different file depending on the value of the query string. There is also a default for the switch statement, just in case the query string is “hacked”. Finally, the last part of the code displays a generic message if the page is visited with no query string variable.
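The three possible outcomes of that include file (a known employee, an unknown value, or no query string at all) can be restated as a small decision function. The sketch below is in JavaScript purely as an illustration; the PHP file remains the real implementation, and the names knownViews and resolveView are hypothetical.

```javascript
// Illustration only: mirrors the branching of the PHP include file.
// Hypothetical whitelist matching the five case labels of the switch statement.
var knownViews = ["employee1", "employee2", "employee3", "employee4", "employee5"];

function resolveView(view) {
  if (view === undefined) {
    // Corresponds to the else branch: no "view" query string variable.
    return { message: "Click a link to view information about an employee." };
  }
  if (knownViews.indexOf(view) === -1) {
    // Corresponds to the default case of the switch statement.
    return { message: "That employee doesn't exist." };
  }
  // Corresponds to one of the five include statements.
  return { include: view + ".html" };
}

console.log(resolveView("employee2"));
console.log(resolveView("employee99"));
console.log(resolveView(undefined));
```

Keeping the allowed values in a list rather than in separate case labels is one way the number of employees could be made flexible, although that goes beyond what this tutorial sets out to do.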

Populating the different employee include files is just a matter of creating them and putting in the necessary markup and data. Again, this is not necessarily the ideal method to store this information, and you certainly would not want to store sensitive information in this manner, but it works well for the purpose of this tutorial.

Each individual employee include file will look something like this:

<h2>Employee #1</h2>
<p>Information for employee #1 goes here.</p>

After setting up all five employee include files, our mini-app is now fully functional.

View the PHP-only Demo Page

Step 3: Intercept Mouse Clicks with JavaScript

Since our page is fully functional with conventional back-end methods, the content will be accessible to virtually all visitors, regardless of their platform or browser limitations. Now we can safely enhance the page with some JavaScript. The first thing we want to do is loop through the links in the navigation section and write some code that will intercept mouse clicks and thus prevent the page from being reloaded. Here’s the code:

var myLinksCollection = document.getElementsByTagName("a");

for (var i=0;i<myLinksCollection.length;i++) {
	myLinksCollection[i].onclick = function() {
		return false;
	}
}

The first line of code creates a collection (an array-like list) holding all the anchor elements on the page. Then, the for loop iterates through the anchor elements and attaches an onclick event handler to each one. An anonymous function runs when a link is clicked, and the return false statement ensures that the location in the href attribute is not visited.

Also, to ensure the onclick handler is applied only to the links in the navigation section, we can add an if statement that looks for a particular string in the href value of each link of the page, like this:

for (var i=0;i<myLinksCollection.length;i++) {
	myLinksCollection[i].onclick = function() {
		if (this.href.indexOf("view=") !== -1) {
			return false;
		}
	}
}

Now the onclick handler will only cancel the default action for links that have the string “view=” in their href attribute; all other links behave normally. There are other ways to limit what is affected in our for loop, but this will suffice to accomplish what we want.

Step 4: Identify Which Link Was Clicked

The loop in the previous section doesn’t accomplish anything — other than intercepting the mouse click. Inside the loop, we’ll add two lines of code to identify which link is being clicked. Now the loop looks like this:

for (var i=0;i<myLinksCollection.length;i++) {
	myLinksCollection[i].onclick = function() {
		if (this.href.indexOf("view=") !== -1) {
			var clickedHREF = this.href;
			var clickedView = clickedHREF.split("view=");
			return false;
		}
	}
}

We have two new lines of code. The first line uses the this keyword (which represents the current object in JavaScript; in this case it’s the anchor that’s been clicked) to identify the value of the href attribute. Remember that each href value has a unique query string that identifies the desired employee info. After that value is placed in a variable, the second line uses the split method to divide the href value by means of the “view=” string, thus creating an array of two elements called “clickedView”.
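To make the result of the split concrete, here is a small sketch that can be run outside the page. The helper name extractView and the example.com addresses are assumptions of mine; the logic is the same check and split used in the loop.

```javascript
// Hypothetical helper wrapping the "view=" check and the split from the loop.
function extractView(href) {
  if (href.indexOf("view=") === -1) {
    return null; // not one of the navigation links
  }
  var clickedView = href.split("view="); // two elements: before and after "view="
  return clickedView[1];
}

// The href property of an anchor holds the full, resolved URL.
console.log("http://example.com/?view=employee2".split("view="));
console.log(extractView("http://example.com/?view=employee2"));
console.log(extractView("http://example.com/about.html"));
```

The first element of the array is everything up to and including the question mark, and the second is the employee identifier. Note that this simple approach assumes “view” is the last (or only) query string value; if another parameter followed it, that text would end up in the second element too.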

Let’s view a demo page of what we have so far, which also includes an alert statement that spits out the value that we’ve extracted from the href attribute:

View Demo that Intercepts Mouse Clicks with JavaScript

Step 5: Display the Requested Content with Ajax

Now that we’ve written the code that identifies on the client side which link was clicked, we can alter the employee info content section through some Ajax code. We know, according to our PHP code, that the unique employee information is contained inside of separate include files whose names match the query string values that we’re extracting. This is done by design, because it allows us to easily load the required page without any further tests.

Here’s how our loop code will look after adding the new line of code that will trigger our Ajax request:

for (var i=0;i<myLinksCollection.length;i++) {
	myLinksCollection[i].onclick = function() {
		if (this.href.indexOf("view=") !== -1) {
			var clickedHREF = this.href;
			var clickedView = clickedHREF.split("view=");
			ajaxInitiate(clickedView[1]+'.html');
			return false;
		}
	}
}

The new line is actually a function call that takes one argument. The argument is the extracted query string value plus “.html”, which amounts to a file name called “employee2.html” (or whatever employee was clicked).

The Ajax code that opens the desired file is complex, and an explanation of it is far beyond the scope of this simple tutorial. You’re welcome to check out my previous tutorials on Ajax using raw JavaScript that discuss in detail the code required:

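Further reading on raw Ajax: see “Ajax From the Ground Up”, Parts 1 and 2, listed under the related posts at the end of this tutorial.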

Of course, you could also implement Ajax code from your favourite JavaScript library, which would be a much easier method for those familiar with Ajax syntax in different libraries.

The final line of code that’s needed, which is added to our Ajax function, is a line that adds the content of the requested file into the “employee-info” <div>. The full Ajax code, plus the line that inserts the content of the file is included in the final demo page. The page also includes a “listener” function that allows multiple events to be inserted on the same page, and doesn’t run the code until the page is finished loading, similar to jQuery’s $(document).ready function.
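For readers who want a rough idea of what such a function involves without opening the demo, here is a minimal sketch. It is not the code from the demo page: the name loadInto and its extra parameters are hypothetical, and the optional createRequest factory exists only so the function can be exercised without a browser.

```javascript
// Hypothetical sketch; the demo's real ajaxInitiate function may differ.
// `target` is the element to fill, e.g. document.getElementById("employee-info").
// `createRequest` is an optional factory that returns an XMLHttpRequest-like object.
function loadInto(fileName, target, createRequest) {
  var request = createRequest ? createRequest() : new XMLHttpRequest();
  request.onreadystatechange = function () {
    // readyState 4 means the request has completed; 200 is the HTTP "OK" status.
    if (request.readyState === 4 && request.status === 200) {
      // The line that inserts the content of the requested file into the page.
      target.innerHTML = request.responseText;
    }
  };
  request.open("GET", fileName, true); // true = asynchronous
  request.send(null);
}
```

In the page itself, a call might look like loadInto("employee2.html", document.getElementById("employee-info")), relying on the browser's own XMLHttpRequest object.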

View Final Demo

Conclusion — Plus Bonus Tutorial

This tutorial has served to demonstrate, using a real, albeit simple, Ajax application, the theoretical steps involved in creating a web page or web-based application with progressive enhancement. Progressive enhancement has its drawbacks, which I won’t go into detail about here, but I think it’s a technique worth considering for all future JavaScript development.

You may have noticed, however, that the final version of the application we built has a major weakness: When JavaScript is enabled, there is no way to link directly to a particular employee (like you would be able to do with the PHP-only version). Next week, I’ll post a quick follow-up bonus tutorial to demonstrate how to add deep linking that doesn’t require you to disable JavaScript.

Related posts:

  1. Ajax From the Ground Up: Part 2 – Sending Data to the Server
  2. Deep Linking in JavaScript and Ajax Applications
  3. Ajax From the Ground Up: Part 1 – XMLHttpRequest




ASP.NET for PHP Developers

Posted by admin | Ajax&Js | Wednesday 13 January 2010 8:35 am

This tutorial, for PHP developers, will provide you with an introduction to ASP.NET using the C# language. If you’ve wondered what ASP.NET is about, this tutorial will strive to answer at least some of your questions. Even if you’re an ardent open-source fan, ASP.NET contains some techniques and features that are useful to know about. And, as some might say, it’s good to know your enemy!

Tutorial Details

  • Technology: ASP.NET (C#)
  • Difficulty: Advanced
  • Estimated Completion Time: 1 hour
  • Part: 1 of 2

Before you Start

ASP.NET is no longer a Microsoft-only technology. Thanks to the hard work of the Mono project contributors, ASP.NET can be used on Linux and Mac platforms, as well as Windows.

Mono Project

They’ve even created an IDE within which you can write your web and desktop applications. Download a copy of MonoDevelop and install it. We’ll be using that IDE for this tutorial, as it offers some useful features to speed up development time. However, just like PHP, ASP.NET applications can be written in nothing more complex than a text editor.

ASP.NET and C# at a Glance

ASP.NET is Microsoft’s web development technology framework, and was originally designed as a replacement for the old Active Server Pages technology. If you’ve used classic ASP (normally using VBScript) you’ll find parts of ASP.NET very familiar. ASP.NET can be used with a wide range of programming languages, including VB.NET (the new version of Visual Basic), J# and C#. This tutorial will be using C#.

ASP.NET is available in two flavours:

  • ASP.NET WebForms: the original framework allowing developers to create web applications using many of the same techniques used in .NET Windows desktop applications
  • ASP.NET MVC: a newer framework offering Model-View-Controller architecture and more control over client-side code

As this tutorial is aimed at PHP developers, who often prefer to get "closer to the metal," I won’t be using either of these. Instead, I’ll be rolling my own application based on the features of the .NET runtime and the C# language, with ASP.NET as a wrapper rather than the framework it was intended to be. Don’t worry if that doesn’t make sense; just continue with the tutorial and you’ll see what I mean.

I’m also running MonoDevelop in Xubuntu Linux, but it should work the same on other platforms.

Important: ASP.NET is very much built on object-oriented programming (OOP). If you have no experience with OOP I strongly suggest you read this introduction to OOP in PHP. You’ll need to understand words like "class", "instance", "method", "property" and "inherit".

Five Things to Watch Out For

C# can confuse PHP developers, especially if you switch between the two languages on a regular basis. Here are my top five gotchas.

  1. String concatenation

    In C# string concatenation is done with "+" rather than ".".

    // PHP
    "This is part 1 " . "and this is part 2!";
    // C#
    "This is part 1 " + "and this is part 2!";
  2. Referencing class methods and properties

    To call a method or property in PHP you’d use:

    $myclass = new MyClass();
    $var = $myclass->var;
    $myclass->doMethod();

    In C# you use a "." instead of "->", like this:

    MyClass myclass = new MyClass();
    string var = myclass.var;
    myclass.doMethod();
  3. Strong typing

    C# is a strongly-typed language, so you can’t, for example, just treat strings as integers like you can in PHP. In PHP, this is true:

    "1" == 1

    In C# it would cause an error. You need to convert one of the elements:

    Convert.ToInt32("1") == 1

    Or:

    "1" == 1.ToString()

    You’ll find yourself using that .ToString() method a lot…

  4. Method return types

    In PHP, a function or method doesn’t have to declare the type of value it returns. In C#, methods must declare the type of value they return:

    protected string MyStringMethod()
    {
    	return "I am a string";
    }
    protected int MyIntMethod()
    {
    	// return an int
    	return 100;
    }

    Methods that don’t return anything must have the void keyword:

    protected void MyMethod() {  }
  5. Scope

    Watch out for the scope of methods and properties. The three important keywords are "public", "protected" and "private". These work the same as they do in PHP.

    // this is public, can be accessed from anywhere
    public int MyInt;
    // this is protected, can only be accessed by the declaring class and any descendants
    protected int MyInt;
    // this is private, can only be accessed by the declaring class
    private int MyInt;

Creating your first ASP.NET Page

Open MonoDevelop.

MonoDevelop

Choose "Start a New Solution" in the main screen. A "solution" is a collection of one or more related projects. They can be projects of different types, for example a web application and a desktop application that works with it, and perhaps a web service as well.

Create a new solution

Select C# ASP.NET Web Application, type a name for your application (I’m using "WebApplication1") and choose a location for your solution then click "Forward". Ignoring the options on the next page, click "OK" and a new ASP.NET application will be created at the location you specified.

The IDE will open the default page, called "Default.aspx" and you’ll see the following code:

<%@ Page Language="C#" Inherits="WebApplication1.Default" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head runat="server">
	<title>Default</title>
</head>
<body>
	<form id="form1" runat="server">
		<asp:Button id="button1" runat="server" Text="Click me!" OnClick="button1Clicked" />
	</form>
</body>
</html>

You’ll see it’s mostly standard HTML, with a few extra bits. There’s already a <form> element with a strange <asp:Button> element inside it. Let’s try running the application.

Running your Application in Debug Mode

MonoDevelop can run your ASP.NET application with one button. Rather than having to set up a local development server and configure your app, simply press F5. A development server will be launched on a non-standard port and you’ll see your application open in a browser. Note port 8080 in the URLs below.

Note: In Xubuntu I had to manually install the Mono web server named XSP2 with this command: sudo apt-get install mono-xsp2.

So, press F5 to run the application. You’ll hopefully see a browser window open that looks like this.

Default page

Clicking the button shows this.

Default page - button clicked

What you’ve just done is click an ASP.NET button, which automatically wires up events to server-side code. In other words, when the button is clicked the page knows about it. This is one of the main advantages of the WebForms framework: it works much like desktop application development. But we’re not going to use this feature as it leads to a world of pain (which is explained in Part 2 of this tutorial). Instead we’re going to do things a little more manually. It’s the PHP way.

So, we’re going to change this code slightly as we don’t want to use the WebForms framework. Edit the opened "Default.aspx" by removing the <form> element and adding an <h1> element so the code is:

<%@ Page Language="C#" Inherits="WebApplication1.Default" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head runat="server">
	<title>Default</title>
</head>
<body>
	<h1>This is the text</h1>
</body>
</html>

Browsing your Solution

On the left hand side of the IDE there are three views of the application. The first view is Classes, and shows the class diagram of the application.

View of the application classes

The second view is Solution, and shows the projects and files which comprise the entire solution (you can have multiple projects in a solution).

View of the solution

And there’s also a Files view which shows you all the directories and files in the solution, including the "WebApplication1.csproj" file which is the C# project file for our application.

View of the files

Choose the Solution view. You’ve already got "Default.aspx" open, which is the page you’ve just edited. Double-click the file called "Default.aspx.cs" to open it in the IDE. Here’s the first important lesson.

Code-behind Files

The "Default.aspx.cs" file is a code-behind file. Rather than peppering your .html (or in this case .aspx, and for PHP applications .php) files with server-side code, it’s possible in ASP.NET to put all your server-side code in a code-behind file. In the same way you don’t have to put all your CSS and JavaScript in every HTML page but can put them in separate files which are included, you can do that with ASP.NET code.

The neat thing is these code-behind files are automatically included in your .aspx page. (That’s why the code-behind is called "Default.aspx.cs", so it is associated with the "Default.aspx" file.) The code of "Default.aspx.cs" is pretty short. Here’s an explanation, line-by-line:

// reference the System namespace
using System;
// reference the System.Web namespace
using System.Web;
// reference the System.Web.UI namespace
using System.Web.UI;
// put this class in the WebApplication1 namespace
namespace WebApplication1
{
	// declare a new partial class called "Default" which inherits (":") from System.Web.UI.Page
	public partial class Default : System.Web.UI.Page
	{
		// NOTE: we're going to delete this method
		// declare a public void method which is called when button1 is clicked
		public virtual void button1Clicked (object sender, EventArgs args)
		{
			// set the text
			button1.Text = "You clicked me";
		}
	}
}

A Note About Namespaces

Wikipedia defines a namespace as:

a namespace is an abstract container providing context for the items (names, or technical terms, or words) it holds and allowing disambiguation of homonym items having the same name (residing in different namespaces)

Which is pretty complicated. I mean, homonym? I see namespaces as virtual directories for classes. Just as in a file system you can’t have two files called "test.txt" in the same folder, you can’t have two classes with the same name in a single namespace. Namespaces can be nested, just like directories (hence the System, System.Web, System.Web.UI references in the code above) and have to be referenced from your ASP.NET code-behind files so you can use the classes contained in them.

You’ll also have noticed that our application is in the namespace WebApplication1, which means any code we write is within that namespace. We’ll use that to our advantage later on.

A Note About Pages

In ASP.NET, a web page is actually a class. It belongs in a namespace in a project, and inherits from the System.Web.UI.Page class. In the code-behind file this is expressed as:

public partial class Default : System.Web.UI.Page { ... }

(The "partial" keyword means that part of this class is also in another file, namely the "Default.aspx.designer.cs" file. As that file is automatically created and maintained we don’t need to worry about it.)

Your "Default.aspx" inherits from the page we’ve just created, using this code:

<%@ Page Language="C#" Inherits="WebApplication1.Default" %>

In the "Page" declaration we’re setting the language for the page and the class it inherits from, in this case WebApplication1.Default. That means that if we create another page, for example "Contact.aspx", the "Page" declaration and code-behind file would be:

// code-behind file
public partial class Contact : System.Web.UI.Page { ... }
// .aspx page
<%@ Page Language="C#" Inherits="WebApplication1.Contact" %>

This can be hard to get your head round to start with, but by the time you’ve created a few pages it will be second nature.

Creating Some Custom Functionality

We’re now ready to write something ourselves. Remove the "button1Clicked" method and in its place add the following code:

protected void Page_Load(object sender, EventArgs e)
{

}

The "Page_Load" method is a handler that is called automatically when the page’s Load event fires. Load is one of quite a few events that happen in the life-cycle of an ASP.NET page; see the full list here. Generally the "Page_Load" method is where you put anything you want to happen when that page is loaded (security checks, fetching data from a database for display, loading an advert etc.). We’re going to declare a new method for the page called "SetText" which is called from the "Page_Load" method:

protected void Page_Load(object sender, EventArgs e)
{
	SetText();
}

protected void SetText()
{
	headertext.InnerHtml = "This is the changed text";
}

You can guess this sets the "InnerHtml" property on the element "headertext" to "This is the changed text". But what is "headertext"? Go back to "Default.aspx" and modify the <h1> element to this:

<h1 id="headertext" runat="server">This is the text</h1>

Then press F5 to run the application. If everything works OK you should see this:

Header text changed

Check the source of the page. You’ll see the runat="server" attribute has gone. What just happened? We turned an HTML control into a server-side control, that’s what.

Server-side Controls

My favourite feature of ASP.NET by far is the ability to turn almost any standard HTML control (elements are also known as controls) into a server-side control. This means that ASP.NET knows what the control is, and can change its properties and run methods on it from the code-behind page. In our example above, the <h1> element had two attributes added:

<h1 id="headertext" runat="server">This is the text</h1>

Now in our code-behind file ("Default.aspx.cs") we could access the control and change its properties.

headertext.InnerHtml = "This is the changed text";

One of the best features of using an IDE such as MonoDevelop rather than a text editor is Intellisense, which gives you a menu of options as you type. Properties of objects, system classes and methods, custom classes and methods, it’s all there:

Intellisense

(I believe "Intellisense" may be a Microsoft trademark, but I’m unsure what MonoDevelop call it!)

There are loads of other properties available for different controls (for example an input type="text" control has a "Value" property), and they all appear in a list as you type. There’s also a massively useful property called Visible. This sets whether the control is rendered into the page source at all. Look at this example:

<div id="notloggedin" runat="server">
	<h1>You are not logged in</h1>
	<p>Please <a href="/login">log in to the site here</a>.</p>
</div>

<div id="loggedin" runat="server" visible="false">
	<h1>Thanks, you've logged in</h1>
	<p>Welcome back, user!</p>
</div>

Notice I’ve manually added the visible="false" for the loggedin element. When used with this server-side code we can make each <div> control visible or invisible.

protected void Page_Load(object sender, EventArgs e)
{
	User currentuser = new User();
	currentuser.CheckSecurity();
	if (currentuser.IsLoggedIn)
	{
		notloggedin.Visible = false;
		loggedin.Visible = true;
	}
}

You can imagine how easy that makes configuration of pages, and how much cleaner your code can be.

How about form elements? Let’s make a <select> control server-side:

<select name="genre" id="genre" runat="server">
	<option value="1">Jazz</option>
	<option value="2">Blues</option>
	<option value="3">Rock</option>
	<option value="4">Pop</option>
	<option value="5">Classical</option>
</select>

Selecting a particular option is as easy as this:

int chosengenre = 4;
genre.Items.FindByValue(chosengenre.ToString()).Selected = true;

Told you you’d use the ToString() method a lot. The <select> control Items property also has a useful method called FindByText() which does this:

string chosengenre = "Blues";
genre.Items.FindByText(chosengenre).Selected = true;

Intellisense will even give you a list of controls it finds, so you don’t even need to remember what you called all those different text boxes. As we’re using normal HTML controls converted to be server-side (rather than true ASP.NET controls) we lose some functionality, but nothing we can’t live without. In Part 2 of the tutorial we’ll be using a true ASP.NET control. But first, a word about configuration.

The Web.config File

In your PHP applications you have no doubt had a configuration file named "config.php" or similar. ASP.NET has a special file type for configuration files, ".config", which is automatically blocked from public view. Your ASP.NET application already has a "Web.config" file; open it from the Solution view in MonoDevelop.

Web.config file

You’ll see it’s a standard XML file. We’re going to add some application-wide settings, so edit the file, adding this code just above </configuration>:

  <appSettings>
    <add key="ApplicationName" value="My first ASP.NET Application"/>
    <add key="Developer" value="Chris Taylor"/>
  </appSettings>

Going back to our "Default.aspx.cs" file we reference the System.Configuration namespace:

using System;
using System.Web;
using System.Web.UI;
using System.Configuration;

And we can now access our "ApplicationName" setting using:

ConfigurationSettings.AppSettings["ApplicationName"];

So to jazz things up a bit, let’s try:

headertext.InnerHtml = "Welcome to " + ConfigurationSettings.AppSettings["ApplicationName"] + " by " + ConfigurationSettings.AppSettings["Developer"];

F5 to run the application and you’ll see:

Application settings in use

As you’ll appreciate, this gives you a huge amount of power over site-wide settings.

Sessions, Cookies and Post/Get Parameters

You can’t get very far developing web applications without dealing with Sessions, Cookies and Post or Get parameters. ASP.NET handles these pretty well, with a couple of gotchas.

Sessions

To get the session ID use: Session.SessionID. Warning: if you don’t store anything in the session the SessionID property *may* change on every page request. Yes, I know, it’s madness. The fix is to store something in the session.

To store something in the session use: Session["var"] = "value";.

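To retrieve something from the session use: string value = (string)Session["var"];. (The session stores everything as object, so you have to cast it back to the type you put in.)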

Cookies

To store something in a cookie use: Response.Cookies["name"].Value = "value";.

To retrieve something from a cookie use: string value = Request.Cookies["name"].Value;.

Note the different use of Request and Response and the fact you have to set the Value property. You can also set other properties of each cookie (check Intellisense for full details):

Response.Cookies["name"].Domain = "domain.com";
Response.Cookies["name"].Expires = DateTime.Now.AddMonths(1);
Response.Cookies["name"].Secure = true;

In that last example you got your first look at the DateTime class. This, in a word, is fantastic. Although it lacks the "numbery-ness" of PHP’s time() function, it provides a massive array of chainable methods to create and modify datetimes. This is a good introductory article.

Post/get Parameters

These are contained, like cookies, in the Request object. That object contains loads of stuff useful for application developers, and can be roughly compared with PHP’s $_SERVER variable. Here are a few other useful things:

// NameValueCollection needs: using System.Collections.Specialized;
NameValueCollection headers = Request.Headers;
bool issecure = Request.IsSecureConnection;
string url = Request.RawUrl;
string scriptname = Request.ServerVariables["SCRIPT_NAME"];
string useragent = Request.ServerVariables["HTTP_USER_AGENT"];

For a full list of Request.ServerVariables this is a good resource.

Getting back to the point; to retrieve something from a post parameter (submitted by a form) use: string value = Request.Form["param"];.

And to retrieve something from a get parameter (in the querystring) use: string value = Request.QueryString["param"];.

For session variables, cookies, and post and get parameters, always check the value is not null before trying to access it. This is the equivalent of PHP’s isset() function. C# is very unforgiving when it comes to null objects, so in general it always pays to check something exists before trying to use it.

// session variable
if (Session["var"] != null) { ... }
// cookie
if (Request.Cookies["cookiename"] != null) { ... }
// post parameter
if (Request.Form["param"] != null) { ... }
// get parameter
if (Request.QueryString["param"] != null) { ... }

You can also put multiple parts in an if statement. So to check if the user has a cookie with the name "loggedin" with the value "true", this will work:

if (Request.Cookies["loggedin"] != null && Request.Cookies["loggedin"].Value == "true") { ... }

Just like PHP, if the first part of the if statement fails the second one is not run.

Quick Reference

Here’s a table of common code snippets in PHP with their equivalents in C#.


Over to You…

You now have the knowledge to build basic ASP.NET pages, and use the MonoDevelop IDE to create simple sites. In part two of this tutorial, you’ll be shown some of the more advanced features of ASP.NET including:

  • MasterPages: page layouts on steroids
  • Data sources and data binding
  • Custom classes: getting all object-oriented on yo’ ass
  • Custom controls: reuse to the max, baby!

Write a Plus Tutorial

Did you know that you can earn up to $600 for writing a PLUS tutorial and/or screencast for us? We’re looking for in depth and well-written tutorials on HTML, CSS, PHP, and JavaScript. If you’re of the ability, please contact Jeffrey at nettuts@tutsplus.com.

Please note that actual compensation will be dependent upon the quality of the final tutorial and screencast.

Write a PLUS tutorial




We Need a Weekly Writer

Posted by admin | Ajax&Js | Wednesday 13 January 2010 8:35 am

A weekly writer position has just opened on Nettuts+. If interested in filling the spot, your duties, should you choose to accept them, will be to submit one design-focused article each week. Mostly, you’ll be free to choose your topics, as long as they’ll appeal to our audience. We’re in need of articles which focus on advanced CSS, HTML 5, the intricacies of a design, typography, adding depth to a layout, round-ups, and anything else you can think of!

What’s Required

  1. A solid grasp of the English language.
  2. Consistency. You must submit one article each week – the day to be determined between the two of us.

I’m Interested. What Next?

You have a few options:

  1. Email nettuts@tutsplus.com with your first tutorial idea. I’ll either okay it or request a different topic.
  2. Take the initiative and prepare a high-quality design article, and submit it to nettuts@tutsplus.com.
  3. Leave a comment with any questions that you might have.

When you’re ready to begin writing, be sure to take a look at our formatting instructions.

How Much Does the Job Pay?

You’ll be offered $150 per article for your first few submissions. However, once you’ve consistently submitted a weekly article for a month, your pay will be upped to $200 per article.




Learn how to Write Lightning-Fast Code in 4 Minutes: Screencast

Posted by admin | Ajax&Js | Wednesday 13 January 2010 8:35 am

We all know the benefits of using snippets and bundles to speed up our coding, but what if we could take things a step further, and turn a complex HTML structure into something as simple as a CSS selector? Well, thanks to a new project called Zen-Coding, we can do this very thing!

In this four-minute video quick tip, I’ll demonstrate how.

Download Zen-Coding


Update

Within the comments below, I made a suggestion that it would be neat if we could also paste in the generic “lorem” text, like so:

div#header>p>lorem

This would generate something like so:

<div id="header">
	<p>
		Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
	</p>
</div>

Unfortunately, I didn’t know how to do it. But luckily, the author of Zen-Coding, Sergey, was able to help me. I recorded a sixty second screencast showing you how to allow for this. You can watch it here.



Getting the Hang of GitHub

Posted by admin | Ajax&Js | Wednesday 13 January 2010 8:35 am

A project is always more fun when you’ve got friends working with you, but how can you do that when working on a coding project? I’ll keep my keyboard to myself, thanks.

Enter GitHub. With this web service, you can share your coding projects and collaborate with ease!

Disclaimer

This tutorial will assume that you’re familiar with Git, arguably the best distributed version control software there is. Already lost? Don’t worry: read my introduction to Git to get up and running. Then come on back here and find out all about GitHub, the social network for developers!

Getting Started

Of course, you’ll need a GitHub account if you’re to experience any of the social coding goodness. Let’s do that right now. Head over to the GitHub website and click “Pricing and Signup” at the top.

Creating an Account

GitHub Home Page

There are several different plans you can use, depending on your needs. Right now, the free “open source” account is all we want; so let’s click “Sign Up.” It’s your standard sign up page; enter your name, email address, and password. You’ll also need an SSH public key; I explained how to get one in my previous article:

Open up your terminal and type this: ssh-keygen -t rsa -C "your@email.com". The -t option assigns a type, and the -C option adds a comment, traditionally your email address. You’ll then be asked where to save the key; just hitting enter will do (that saves the file to the default location). Then, enter a pass-phrase, twice. Now you have a key; let’s give it to GitHub.

First, get your key from the file. The terminal will have told you where the key was stored (the public key is in the file ending in .pub); open that file and copy the key, being careful not to add any newlines or white-space.

Once you have the key, just paste it into the correct field. Like it says, you don’t have to do this now; you can add a key later. Then click the button to agree and sign up.

Warming up to the Interface

When you first log in, you’ll see the dashboard; it’s something like this:

Signed in to GitHub

In the top right corner, you can see a toolbar with options for controlling your account. There are also some links for getting around GitHub, as well as the search box.

Tools

The main panel offers a number of actions; later, it will be used to keep you up to date on projects you’re interested in. And on the right of that, your own repositories will be listed. Remember, in Git, repositories are the containers that hold all the code and history related to one project.

When you have a chance, browse around the Account Settings. It’s all what you’d expect, so I won’t go over it; but if you plan to be an avid githubber, you should probably fill out your profile and see what else is in there.

It’s all about Repos

Really, the whole purpose of GitHub is making Git repositories available to the world; therefore, it follows that working with repositories (or repos, as they’re often called) is something you should be comfy with.

Creating a repository is pretty simple. On the dashboard, click “New Repository” on top of your repo list (which is currently empty).

New Repository Button

Three textboxes await you here; your repository will need a project name, a description, and the URL for the project’s website. If you’ve upgraded to one of the paid plans, you can choose whether this should be available to the public or not. When you’re done, hit “Create Repository.”

GitHub will now give you instructions for hooking the GitHub repository up to one of your local Git repositories. If you’re familiar with Git, this should be pretty old hat to you. The important part is the last two lines of either the “Next Steps” or the “Existing Git Repo?” section:

  git remote add origin git@github.com:aburgess/My-First-GitHub-Repo.git
  git push origin master

The git-remote command allows you to track other repositories and keep them “synchronized” with your local one. In our case, we’re tracking the repository on GitHub from our local repository. So, that first line is adding a remote; we’re calling it “origin” and giving it a URL. This is a private URL that only you can use to read and write to your GitHub repo.

In the second line, we’re using the git-push command to send everything in the master branch out to origin (our GitHub repository). Notice that none of the coding or project work (creating files, etc) is done on GitHub. That’s all local work; you should work on your project just as you would a plain vanilla local repo. However, you’ll regularly push it to GitHub with that second line.
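If you’d like to try those two commands without touching GitHub, here’s a minimal sketch that uses a local bare repository as a stand-in for the GitHub repo. The directory names, the README file, and the committer identity are all made up for illustration; with a real GitHub repo, the remote URL would be the private one GitHub gives you.

```shell
# Work in a throwaway directory; every path below is hypothetical.
work=$(mktemp -d)

# A bare repository plays the role of the GitHub repo in this sketch.
git init -q --bare "$work/github-standin.git"

# A plain local repository with a single commit on the master branch.
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Example User"       # placeholder identity for the commit
git config user.email "user@example.com"
echo "Hello, GitHub" > README
git add README
git commit -q -m "First commit"

# The same two steps as above: add a remote called origin, then push master to it.
git remote add origin "$work/github-standin.git"
git push -q origin master

# The stand-in repository should now hold the same commit as the local master.
local_head=$(git rev-parse master)
remote_head=$(git --git-dir="$work/github-standin.git" rev-parse master)
echo "local:  $local_head"
echo "remote: $remote_head"
```

If the push worked, the two hashes printed at the end are identical. From then on, each new batch of commits goes out with another git push origin master.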

So let’s say you’ve been working on a project for a while and you’ve been pushing your commits to a GitHub repo. What project meta-data can we explore on GitHub? Let’s look at the jQuery repository.

jQuery's GitHub Page

See that toolbar near the top, just under the project name? These options allow us to drill down into some of the Git information that we uncovered in the command line last time. Right now, we’re on the source tab. It starts with drop-downs for the repo branches and tags; use these to view the different branches or tags. There’s also some project meta-data; we have the project name, URL, and cloning URLs (These URLs are read-only).

jQuery Meta

Then, you can see the information about the latest commit: author, time, comment, and hashes. Under that, you’ve got a file browser, which shows you the latest versions of all the files in the project, as well as their age and the message of the last commit they were changed in.

Files

If the project has a readme file, it’s displayed under the file browser.

Commits

Let’s switch to the commits tab at the top and see what it holds.

Commit History

As you might expect, we can view a backlog of commits on this project. Clicking the commit hash (on the right of the commit entry) will let you view what was changed in the commit.

Commit Details

Green lines (prepended with a ‘+’) are additions and red lines (prepended with a ‘-’) are deletions.

You can subscribe to the commit history RSS feed of any repo if you want to keep track of it.

Network

Network Graph

The next tab is the network tab; this shows you a graphical representation of the repository’s history. This graph is drawn from the perspective of the committer ‘jquery.’ Each commit only shows up once, so commits not on jQuery’s line are not in jQuery’s repo. This way, we can see what commits other people have made that we don’t have. It’s incredibly useful, but somewhat confusing if you’re not grokking git all the way. If you want to know more, check out GitHub’s blog post about it.

The network tab also offers a list of members (people who have forked the repo) and a feed as well.

Network

Graphs

The graphs tab offers you a number of different views of your project from a graphical perspective. It’s really just meta-data, but it may offer some interesting insights:

Network Graphs

Fork Queue

If you’re viewing a repository you own, you’ll see another tab, Fork Queue. This shows you the same information you saw in the network graph, but differently. From here, you can choose which commits to apply to your repository, on whichever branch you want. For more on the fork queue, check out this post and video on the GitHub blog.

You can enable three other tabs from the repo admin panel: Wiki, Issues, and Downloads. These allow you to create a wiki, track bugs in your project, and offer downloadable copies of your repo. They are all pretty intuitive.

The Social Side

Like the tagline says, GitHub is all about social coding. Although we’ve already seen a few of them, it’s time to check out the rest of the social features of GitHub.

Exploring Other Repositories

If you ever have some free time and want to dig into a coding project for a while, GitHub is the ideal place to go. Let’s see how we can find interesting projects.

Start by clicking “Explore GitHub” to the right of the search box. We’ve got a couple of tabs to choose from here:

  • Repositories
  • Search
  • Timeline
  • Languages
  • Changelog

For me, the most interesting is the languages tab; choose your language and check out the most watched and most forked projects for that language.

JS Repos

So let’s say you’ve found a project you’re interested in; what next?

Interacting with other repositories

Social Tools

When viewing a repo, the social tools are underneath the search bar. Starting at the right, you can see how many people have forked the project, and how many people are watching it. We’ve been talking about forking and watching, but it’s time to get some formal definitions on the table.

  • Forking a project means that you’ll get a copy of that repository that you can play with all you want.
  • Watching a project means that any actions on it will show up in your news feed, on your dashboard.

I’m taking a rabbit trail, but you can do more than watch projects; if you find particularly talented developers, you can follow them to track their every keystroke. Just click the follow button on their profile page.

Resig's Profile

You can also send them a message if you’d like. Now we’re talking social network!

Back on the social toolbar, the next button (moving left) predictably lets you download the source code of the project. The next button lets you fork the repo, and the last one lets you watch it (you saw that coming).

If you’re looking at a repository of your own, there will be two other buttons in the row: ‘Pull Request’ and ‘Admin.’ We’ll come back to ‘Pull Request’ in a minute; feel free to look around the admin panel.

Admin Panel

It should be obvious what most of the basic features are:

  • change your default branch
  • create a project page (we’ll come back to this)
  • turn those extra tabs (wiki, downloads, issues) on/off
  • rename or delete the repo

When you’re done, click your project name at the top to return to the source view.

Our scenario before introducing the social tools was that you had found a repo you want to work on. The first thing to do is fork the project; just click that ‘fork’ button. Now you’ve got a copy of the repo in your account. Let’s clone the repo to your computer, using your private clone URL. Doing this automatically sets up a remote called origin, as we discussed earlier. That origin is not the original project you forked (you can’t write to that) but your forked copy of it. You do need to set up a remote for the original project, however; do that with these commands:

git remote add upstream [original project's public clone URL here]
git fetch upstream

That last line gets the latest branches from the upstream remote and stores them in tracking branches.

Now’s the time to begin coding. As you make commits, they’ll show up in the original project’s network graph, because they aren’t in that repo. They’ll also show up in that repo’s fork queue, so the owner can pull them in if he/she likes them. However, you can request that they be pulled in. That’s what the pull request button is for: you can send the owner of the original repo a message.

Pull Request
Pull Received

It’s nothing fancy; really, it just lets them know that they should check out their fork queue.

But what if they’ve made changes to the project since you forked it? It’s best practice to integrate their changes into your repo before requesting they pull in your changes. You can easily get their updates using that upstream remote you made! Just run these commands:

git fetch upstream master
git merge upstream/master

As we just saw a moment ago, the fetch command gets the latest content from the specified remote, in this case our upstream remote. Also, we’re specifying that we only want the master branch. The second line merges the specified branch with the one we’re on. We’re on the master branch of our local repo, and we want to merge our tracking branch upstream/master.

Note: If this tracking branch talk is confusing you, check out this article on gitready.com. Basically, tracking branches are simply branches that keep track of where other repos with the same project are in relation to your repo. To see all your branches run the command git branch -a. Here’s what I get for a forked repo:

git branch -a

If you’re lazy, you could run the command git pull upstream master. This does both the fetch and merge commands at once. However, this could cause merging problems, so stay away from it!

Once you’ve merged the changes, you can request a pull with peace of mind, knowing your fork is up to date with the upstream repository.

User and Project Pages

GitHub gives you a rather unique ability to create a personal website from a GitHub repository. My GitHub username is andrew8088, so if I created a repo named andrew8088.github.com, it would be published at http://andrew8088.github.com. Simple as that!

You can also create a site for a project; this is a bit more complicated, but GitHub makes it easy to get a generic, single page explaining your project. Click that admin button in your social toolbar. Then click ‘Generate Your Project Page’ under ‘Repository Information.’

project page

Then, just fill in all the fields:

project page form

And click “Create Page.”

If you want to customize your page further or add pages, you can follow the advanced instructions at the GitHub Pages Help.

When it’s only a Snippet

Ever been talking code with a friend online and wanted to share a snippet? It happens to all of us, and GitHub offers a quick way to do it. It’s Gist. They say it best:

Gist is a simple way to share snippets and pastes with others. All gists are git repositories, so they are automatically versioned, forkable and usable as a git repository.

Click the Gist link to the left of the search box, or go to Gist.GitHub.com. You can simply paste in your snippet or type it all in. Give it a file name and let them know what language you’re writing in (for syntax highlighting). Then click ‘Create Public Gist.’ Copy the URL and share!

gist

Keeping in the Know

Any social network can become addicting, and GitHub is no different. If you want to keep track of your repos on the go, check out iOctocat, the GitHub app for iPhone and iPod Touch.

iOctocat

Conclusion

Well, it’s been a whirlwind tour, but I hope you’re feeling more familiar with what could be the best code hosting/sharing site on the web! Ever used GitHub? Have a better option? Hit the comments!

Write a Plus Tutorial

Did you know that you can earn up to $600 for writing a PLUS tutorial and/or screencast for us? We’re looking for in depth and well-written tutorials on HTML, CSS, PHP, and JavaScript. If you’re of the ability, please contact Jeffrey at nettuts@tutsplus.com.

Please note that actual compensation will be dependent upon the quality of the final tutorial and screencast.

Write a PLUS tutorial




How to Build a Newspaper Website with a Grid: New Plus Tutorial

Posted by admin | Ajax&Js | Wednesday 13 January 2010 8:35 am

In this week’s Plus video tutorial, you’ll learn how to utilize a grid to create a simple newspaper-like website. Along the way, you’ll learn helpful techniques, such as easy ways to target IE7 and IE6 with only a single character, using the 960 grid system, and even using CSS3 to create columns! It’s an hour long; ready to dig in? Join Plus!

Final Preview

Join Tuts Plus

NETTUTS+ Screencasts and Bonus Tutorials

For those unfamiliar, the family of TUTS sites runs a premium membership service called “TUTSPLUS”. For $9 per month, you gain access to exclusive premium tutorials, screencasts, and freebies from Nettuts+, Psdtuts+, Aetuts+, Audiotuts+, and Vectortuts+! For the price of a pizza, you’ll learn from some of the best minds in the business. Join today!


Techniques for Mastering cURL

Posted by admin | Ajax&Js | Wednesday 13 January 2010 8:35 am

cURL is a tool for transferring files and data with URL syntax, supporting many protocols including HTTP, FTP, TELNET and more. Initially, cURL was designed to be a command line tool. Lucky for us, the cURL library is also supported by PHP. In this article, we will look at some of the advanced features of cURL, and how we can use them in our PHP scripts.

Why cURL?

It’s true that there are other ways of fetching the contents of a web page. Many times, mostly due to laziness, I have just used simple PHP functions instead of cURL:

$content = file_get_contents("http://www.nettuts.com");

// or

$lines = file("http://www.nettuts.com");

// or

readfile("http://www.nettuts.com");

However, they have virtually no flexibility and lack sufficient error handling. Also, there are certain tasks that you simply cannot do with them, like dealing with cookies, authentication, form posts, file uploads etc.

cURL is a powerful library that supports many different protocols and options, and provides detailed information about the URL requests.

Basic Structure

Before we move on to more complicated examples, let’s review the basic structure of a cURL request in PHP. There are four main steps:

  1. Initialize
  2. Set Options
  3. Execute and Fetch Result
  4. Free up the cURL handle

// 1. initialize
$ch = curl_init();

// 2. set the options, including the url
curl_setopt($ch, CURLOPT_URL, "http://www.nettuts.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);

// 3. execute and fetch the resulting HTML output
$output = curl_exec($ch);

// 4. free up the curl handle
curl_close($ch);

Step #2 (i.e. curl_setopt() calls) is going to be a big part of this article, because that is where all the magic happens. There is a long list of cURL options that can be set, which can configure the URL request in detail. It might be difficult to go through the whole list and digest it all at once. So today, we are just going to use some of the more common and useful options in various code examples.

Checking for Errors

Optionally, you can also add error checking:

// ...

$output = curl_exec($ch);

if ($output === FALSE) {

	echo "cURL Error: " . curl_error($ch);

}

// ...

Please note that we need to use “=== FALSE” for comparison instead of “== FALSE”, because we need to distinguish between empty output and the boolean value FALSE, which indicates an error.
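As a minimal sketch of the same check in reusable form, the snippet below wraps the request in a function that throws an exception on failure. The function name fetch_or_fail() is hypothetical and not part of the original article; curl_errno() is the standard companion to curl_error() and returns the numeric error code.

```php
<?php
// Hypothetical helper (an assumption, not from the article):
// fetch a URL and turn a cURL failure into an exception.
function fetch_or_fail($url) {
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

	$output = curl_exec($ch);

	// strict comparison: an empty string is a valid (empty) response
	if ($output === FALSE) {
		// read the error details before the handle is closed
		$message = curl_error($ch);
		$code = curl_errno($ch);
		curl_close($ch);
		throw new RuntimeException("cURL Error ($code): $message", $code);
	}

	curl_close($ch);
	return $output;
}
```

Calling code can then wrap fetch_or_fail() in a try/catch block instead of testing the return value after every request.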

Getting Information

Another optional step is to get information about the cURL request, after it has been executed.

// ...

curl_exec($ch);

$info = curl_getinfo($ch);

echo 'Took ' . $info['total_time'] . ' seconds for url ' . $info['url'];

// ...

The following information is included in the returned array:

  • “url”
  • “content_type”
  • “http_code”
  • “header_size”
  • “request_size”
  • “filetime”
  • “ssl_verify_result”
  • “redirect_count”
  • “total_time”
  • “namelookup_time”
  • “connect_time”
  • “pretransfer_time”
  • “size_upload”
  • “size_download”
  • “speed_download”
  • “speed_upload”
  • “download_content_length”
  • “upload_content_length”
  • “starttransfer_time”
  • “redirect_time”

Detect Redirection Based on Browser

In this first example, we will write a script that can detect URL redirections based on different browser settings. For example, some websites redirect cellphone browsers, or even surfers from different countries.

We are going to be using the CURLOPT_HTTPHEADER option to set our outgoing HTTP Headers including the user agent string and the accepted languages. Finally we will check to see if these websites are trying to redirect us to different URLs.

// test URLs
$urls = array(
	"http://www.cnn.com",
	"http://www.mozilla.com",
	"http://www.facebook.com"
);
// test browsers
$browsers = array(

	"standard" => array (
		"user_agent" => "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6 (.NET CLR 3.5.30729)",
		"language" => "en-us,en;q=0.5"
		),

	"iphone" => array (
		"user_agent" => "Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Version/3.0 Mobile/1A537a Safari/419.3",
		"language" => "en"
		),

	"french" => array (
		"user_agent" => "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; GTB6; .NET CLR 2.0.50727)",
		"language" => "fr,fr-FR;q=0.5"
		)

);

foreach ($urls as $url) {

	echo "URL: $url\n";

	foreach ($browsers as $test_name => $browser) {

		$ch = curl_init();

		// set url
		curl_setopt($ch, CURLOPT_URL, $url);

		// set browser specific headers
		curl_setopt($ch, CURLOPT_HTTPHEADER, array(
				"User-Agent: {$browser['user_agent']}",
				"Accept-Language: {$browser['language']}"
			));

		// we don't want the page contents
		curl_setopt($ch, CURLOPT_NOBODY, 1);

		// we need the HTTP Header returned
		curl_setopt($ch, CURLOPT_HEADER, 1);

		// return the results instead of outputting it
		curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

		$output = curl_exec($ch);

		curl_close($ch);

		// was there a redirection HTTP header?
		if (preg_match("!Location: (.*)!", $output, $matches)) {

			echo "$test_name: redirects to $matches[1]\n";

		} else {

			echo "$test_name: no redirection\n";

		}

	}
	echo "\n\n";
}

First we have a set of URLs to test, followed by a set of browser settings to test each of these URLs against. Then we loop through these test cases and make a cURL request for each.

Because of the way we set up the cURL options, the returned output will only contain the HTTP headers (saved in $output). With a simple regex, we can see if there was a “Location:” header included.
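The simple pattern in the script is case-sensitive and may also capture the carriage return that ends each header line. A slightly stricter variant could look like the sketch below; the function name find_redirect_target() and the sample header block are made up for illustration.

```php
<?php
// Hypothetical helper: pull the redirect target out of a raw header block.
function find_redirect_target($headers) {
	// "m" lets ^ match at the start of every header line,
	// "i" makes the header name case-insensitive,
	// the surrounding \s* strips the trailing line break characters
	if (preg_match('/^Location:\s*(.+?)\s*$/mi', $headers, $matches)) {
		return $matches[1];
	}
	// no Location header found
	return null;
}

// made-up response headers, only for demonstration
$sample = "HTTP/1.1 302 Found\r\nContent-Type: text/html\r\nLocation: http://m.example.com/\r\n\r\n";
var_dump(find_redirect_target($sample));
```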

When you run this script, you should get an output like this:

POSTing to a URL

On a GET request, data can be sent to a URL via the “query string”. For example, when you do a search on Google, the search term is located in the query string part of the URL:

http://www.google.com/search?q=nettuts

You may not need cURL to simulate this in a web script. You can just be lazy and hit that url with “file_get_contents()” to receive the results.

But some HTML forms are set to the POST method. When these forms are submitted through the browser, the data is sent via the HTTP Request body, rather than the query string. For example, if you do a search on the CodeIgniter forums, you will be POSTing your search query to:

http://codeigniter.com/forums/do_search/

We can write a PHP script to simulate this kind of URL request. First let’s create a simple file for accepting and displaying the POST data. Let’s call it post_output.php:

print_r($_POST);

Next we create a PHP script to perform a cURL request:

$url = "http://localhost/post_output.php";

$post_data = array (
	"foo" => "bar",
	"query" => "Nettuts",
	"action" => "Submit"
);

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, $url);

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// we are doing a POST request
curl_setopt($ch, CURLOPT_POST, 1);
// adding the post variables to the request
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data);

$output = curl_exec($ch);

curl_close($ch);

echo $output;

When you run this script, you should get an output like this:

It sent a POST to the post_output.php script, which dumped the $_POST variable, and we captured that output via cURL.
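One detail worth knowing: when CURLOPT_POSTFIELDS receives an array, cURL sends the request as multipart/form-data, while a URL-encoded string is sent as application/x-www-form-urlencoded, which is what a regular HTML form uses. The following sketch encodes the same fields with http_build_query(); the target URL is the local test script from above, and the request itself is left commented out.

```php
<?php
$post_data = array (
	"foo" => "bar",
	"query" => "Nettuts",
	"action" => "Submit"
);

// URL-encode the fields into a single string
$encoded = http_build_query($post_data);
echo $encoded . "\n";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://localhost/post_output.php");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
// a string (instead of an array) is sent as application/x-www-form-urlencoded
curl_setopt($ch, CURLOPT_POSTFIELDS, $encoded);

// $output = curl_exec($ch); // uncomment to actually perform the request
curl_close($ch);
```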

File Upload

Uploading files works very similarly to the previous POST example, since all file upload forms have the POST method.

First let’s create a file for receiving the request and call it upload_output.php:

print_r($_FILES);

And here is the actual script performing the file upload:

$url = "http://localhost/upload_output.php";

$post_data = array (
	"foo" => "bar",
	// file to be uploaded
	"upload" => "@C:/wamp/www/test.zip"
);

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, $url);

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

curl_setopt($ch, CURLOPT_POST, 1);

curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data);

$output = curl_exec($ch);

curl_close($ch);

echo $output;

When you want to upload a file, all you have to do is pass its file path just like a post variable, and put the @ symbol in front of it. When you run this script, it should print the contents of the $_FILES array, including the details of the uploaded file.
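Note that the @ prefix was deprecated in PHP 5.5 and is no longer supported as of PHP 7.0; newer versions use the CURLFile class instead. Here is a rough sketch of a field builder that works either way. The helper name make_upload_field() is hypothetical, and the file path is the one from the example above.

```php
<?php
// Hypothetical helper: build an upload field for old and new PHP versions.
function make_upload_field($path) {
	if (class_exists('CURLFile')) {
		// PHP 5.5 and later
		return new CURLFile($path);
	}
	// older PHP versions: fall back to the @ prefix
	return "@" . $path;
}

$post_data = array (
	"foo" => "bar",
	// file to be uploaded
	"upload" => make_upload_field("C:/wamp/www/test.zip")
);
```

The resulting $post_data array is passed to CURLOPT_POSTFIELDS exactly as before.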

Multi cURL

One of the more advanced features of cURL is the ability to create a “multi” cURL handle. This allows you to open connections to multiple URLs simultaneously and asynchronously.

On a regular cURL request, the script execution stops and waits for the URL request to finish before it can continue. If you intend to hit multiple URLs, this can take a long time, as you can only request one URL at a time. We can overcome this limitation by using the multi handle.

Let’s look at this sample code from php.net:

// create both cURL resources
$ch1 = curl_init();
$ch2 = curl_init();

// set URL and other appropriate options
curl_setopt($ch1, CURLOPT_URL, "http://lxr.php.net/");
curl_setopt($ch1, CURLOPT_HEADER, 0);
curl_setopt($ch2, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch2, CURLOPT_HEADER, 0);

//create the multiple cURL handle
$mh = curl_multi_init();

//add the two handles
curl_multi_add_handle($mh,$ch1);
curl_multi_add_handle($mh,$ch2);

$active = null;
//execute the handles
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

while ($active && $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}

//close the handles
curl_multi_remove_handle($mh, $ch1);
curl_multi_remove_handle($mh, $ch2);
curl_multi_close($mh);

The idea is that you can open multiple cURL handles and assign them to a single multi handle. Then you can wait for them to finish executing while in a loop.

There are two main loops in this example. The first do-while loop repeatedly calls curl_multi_exec(). This function is non-blocking. It executes as little as possible and returns a status value. As long as the returned value is the constant ‘CURLM_CALL_MULTI_PERFORM’, it means that there is still more immediate work to do (for example, sending http headers to the URLs.) That’s why we keep calling it until the return value is something else.

In the following while loop, we continue as long as the $active variable is ‘true’. This was passed as the second argument to the curl_multi_exec() call. It is set to ‘true’ as long as there are active connections within the multi handle. The next thing we do is call curl_multi_select(). This function is ‘blocking’ until there is any connection activity, such as receiving a response. When that happens, we go into yet another do-while loop to continue executing.

Let’s see if we can create a working example ourselves, that has a practical purpose.

Wordpress Link Checker

Imagine a blog with many posts containing links to external websites. Some of these links might end up dead after a while for various reasons. Maybe the page is no longer there, or the entire website is gone.

We are going to be building a script that analyzes all the links and finds non-loading websites and 404 pages and returns a report to us.

Note that this is not going to be an actual Wordpress plug-in. It is only a standalone utility script, and it is just for demonstration purposes.

So let’s get started. First we need to fetch the links from the database:

// CONFIG
$db_host = 'localhost';
$db_user = 'root';
$db_pass = '';
$db_name = 'wordpress';
$excluded_domains = array(
	'localhost', 'www.mydomain.com');
$max_connections = 10;
// initialize some variables
$url_list = array();
$working_urls = array();
$dead_urls = array();
$not_found_urls = array();
$active = null;

// connect to MySQL
if (!mysql_connect($db_host, $db_user, $db_pass)) {
	die('Could not connect: ' . mysql_error());
}
if (!mysql_select_db($db_name)) {
	die('Could not select db: ' . mysql_error());
}

// get all published posts that have links
$q = "SELECT post_content FROM wp_posts
	WHERE post_content LIKE '%href=%'
	AND post_status = 'publish'
	AND post_type = 'post'";
$r = mysql_query($q) or die(mysql_error());
while ($d = mysql_fetch_assoc($r)) {

	// get all links via regex
	if (preg_match_all("!href=\"(.*?)\"!", $d['post_content'], $matches)) {

		foreach ($matches[1] as $url) {

			// exclude some domains
			$tmp = parse_url($url);
			if (in_array($tmp['host'], $excluded_domains)) {
				continue;
			}

			// store the url
			$url_list []= $url;
		}
	}
}

// remove duplicates
$url_list = array_values(array_unique($url_list));

if (!$url_list) {
	die('No URL to check');
}

First we have some database configuration, followed by an array of domain names we will ignore ($excluded_domains). Also we set a number for maximum simultaneous connections we will be using later ($max_connections). Then we connect to the database, fetch posts that contain links, and collect them into an array ($url_list).

The following code might be a little complex, so I will try to explain it in small steps.

// 1. multi handle
$mh = curl_multi_init();

// 2. add multiple URLs to the multi handle
for ($i = 0; $i < $max_connections; $i++) {
	add_url_to_multi_handle($mh, $url_list);
}

// 3. initial execution
do {
	$mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

// 4. main loop
while ($active && $mrc == CURLM_OK) {

	// 5. there is activity
	if (curl_multi_select($mh) != -1) {

		// 6. do work
		do {
			$mrc = curl_multi_exec($mh, $active);
		} while ($mrc == CURLM_CALL_MULTI_PERFORM);

		// 7. is there info?
		if ($mhinfo = curl_multi_info_read($mh)) {
			// this means one of the requests was finished

			// 8. get the info on the curl handle
			$chinfo = curl_getinfo($mhinfo['handle']);

			// 9. dead link?
			if (!$chinfo['http_code']) {
				$dead_urls []= $chinfo['url'];

			// 10. 404?
			} else if ($chinfo['http_code'] == 404) {
				$not_found_urls []= $chinfo['url'];

			// 11. working
			} else {
				$working_urls []= $chinfo['url'];
			}

			// 12. remove the handle
			curl_multi_remove_handle($mh, $mhinfo['handle']);
			curl_close($mhinfo['handle']);

			// 13. add a new url and do work
			if (add_url_to_multi_handle($mh, $url_list)) {

				do {
					$mrc = curl_multi_exec($mh, $active);
				} while ($mrc == CURLM_CALL_MULTI_PERFORM);
			}
		}
	}
}

// 14. finished
curl_multi_close($mh);

echo "==Dead URLs==\n";
echo implode("\n",$dead_urls) . "\n\n";

echo "==404 URLs==\n";
echo implode("\n",$not_found_urls) . "\n\n";

echo "==Working URLs==\n";
echo implode("\n",$working_urls);

// 15. adds a url to the multi handle
function add_url_to_multi_handle($mh, $url_list) {
	static $index = 0;

	// if we have another url to get
	if (isset($url_list[$index])) {

		// new curl handle
		$ch = curl_init();

		// set the url
		curl_setopt($ch, CURLOPT_URL, $url_list[$index]);
		// to prevent the response from being outputted
		curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
		// follow redirections
		curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
		// do not need the body. this saves bandwidth and time
		curl_setopt($ch, CURLOPT_NOBODY, 1);

		// add it to the multi handle
		curl_multi_add_handle($mh, $ch);

		// increment so next url is used next time
		$index++;

		return true;
	} else {

		// we are done adding new URLs
		return false;
	}
}

And here is the explanation for the code above. Numbers in the list correspond to the numbers in the code comments.

  1. Created a multi handle.
  2. We will be creating the add_url_to_multi_handle() function later on. Every time it is called, it will add a url to the multi handle. Initially, we add 10 (based on $max_connections) URLs to the multi handle.
  3. We must run curl_multi_exec() for the initial work. As long as it returns CURLM_CALL_MULTI_PERFORM, there is work to do. This is mainly for creating the connections. It does not wait for the full URL response.
  4. This main loop runs as long as there is some activity in the multi handle.
  5. curl_multi_select() makes the script wait until there is activity on any of the URL requests.
  6. Again we must let cURL do some work, mainly for fetching response data.
  7. We check for info. There is an array returned if a URL request was finished.
  8. There is a cURL handle in the returned array. We use that to fetch info on the individual cURL request.
  9. If the link was dead or timed out, there will be no http code.
  10. If the link was a 404 page, the http code will be set to 404.
  11. Otherwise we assume it was a working link. (You may add additional checks for 500 error codes etc…)
  12. We remove the cURL handle from the multi handle since it is no longer needed, and close it.
  13. We can now add another url to the multi handle, and again do the initial work before moving on.
  14. Everything is finished. We can close the multi handle and print a report.
  15. This is the function that adds a new url to the multi handle. The static variable $index is incremented every time this function is called, so we can keep track of where we left off.
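Step 11 mentions adding checks for other error codes. One possible way to do that, sketched under the assumption that you want separate buckets for client and server errors, is a small classification function. The function name and the two extra bucket names are made up for this example.

```php
<?php
// Hypothetical helper: sort a finished request into a report bucket
// based on the 'http_code' value returned by curl_getinfo().
function classify_http_code($http_code) {
	if (!$http_code) {
		// no response at all: dead link or timeout
		return 'dead';
	} else if ($http_code == 404) {
		return 'not_found';
	} else if ($http_code >= 500) {
		// extra bucket, not in the script above
		return 'server_error';
	} else if ($http_code >= 400) {
		// extra bucket, not in the script above
		return 'client_error';
	}
	return 'working';
}
```

In the main loop, the if/else chain of steps 9 to 11 could then be replaced with a single call that picks the array to append the URL to.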

I ran the script on my blog (with some broken links added on purpose, for testing), and here is what it looked like:

It took less than 2 seconds to go through about 40 URLs. The performance gains are significant when dealing with even larger sets of URLs. If you open ten connections at the same time, it can run up to ten times faster. You can also utilize the non-blocking nature of the multi cURL handle to do URL requests without stalling your web script.

Some Other Useful cURL Options

HTTP Authentication

If there is HTTP based authentication on a URL, you can use this:

$url = "http://www.somesite.com/members/";

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

// send the username and password
curl_setopt($ch, CURLOPT_USERPWD, "myusername:mypassword");

// if you allow redirections
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
// this lets cURL keep sending the username and password
// after being redirected
curl_setopt($ch, CURLOPT_UNRESTRICTED_AUTH, 1);

$output = curl_exec($ch);

curl_close($ch);

FTP Upload

PHP does have an FTP library, but you can also use cURL:

// open a file pointer
$fp = fopen("/path/to/file", "r");

// the url contains most of the info needed
$url = "ftp://username:password@mydomain.com:21/path/to/new/file";

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

// upload related options
curl_setopt($ch, CURLOPT_UPLOAD, 1);
curl_setopt($ch, CURLOPT_INFILE, $fp);
curl_setopt($ch, CURLOPT_INFILESIZE, filesize("/path/to/file"));

// set for ASCII mode (e.g. text files)
curl_setopt($ch, CURLOPT_FTPASCII, 1);

$output = curl_exec($ch);
curl_close($ch);

Using a Proxy

You can perform your URL request through a proxy:

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL,'http://www.example.com');

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

// set the proxy address to use
curl_setopt($ch, CURLOPT_PROXY, '11.11.11.11:8080');

// if the proxy requires a username and password
curl_setopt($ch, CURLOPT_PROXYUSERPWD,'user:pass');

$output = curl_exec($ch);

curl_close ($ch);

Callback Functions

It is possible to have cURL call given callback functions during the URL request, before it is finished. For example, as the contents of the response are being downloaded, you can start using the data, without waiting for the whole download to complete.

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL,'http://net.tutsplus.com');

curl_setopt($ch, CURLOPT_WRITEFUNCTION,"progress_function");

curl_exec($ch);

curl_close ($ch);

function progress_function($ch,$str) {

	echo $str;
	return strlen($str);

}

The callback function MUST return the length of the string, which is a requirement for this to work properly.

As the URL response is being fetched, every time a data packet is received, the callback function is called.
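Instead of echoing each chunk, the callback can also collect the data. The sketch below uses a closure (available since PHP 5.3) that appends every chunk to a buffer; the function name make_collector() is hypothetical, and the request itself is left commented out.

```php
<?php
// Hypothetical helper: returns a write callback that appends
// every received chunk to the variable passed in by reference.
function make_collector(&$buffer) {
	return function ($ch, $chunk) use (&$buffer) {
		$buffer .= $chunk;
		// must return the number of bytes handled,
		// otherwise cURL aborts the transfer
		return strlen($chunk);
	};
}

$body = "";

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL,'http://net.tutsplus.com');

curl_setopt($ch, CURLOPT_WRITEFUNCTION, make_collector($body));

// curl_exec($ch); // uncomment to run the request and fill $body

curl_close ($ch);
```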

Conclusion

We have explored the power and the flexibility of the cURL library today. I hope you enjoyed and learned from this article. Next time you need to make a URL request in your web application, consider using cURL.

Thank you and have a great day!


Add Caching to a Data Access Layer

Posted by admin | Ajax&Js | Wednesday 13 January 2010 8:35 am

Dynamic web pages are great; you can adapt the resulting page to your user, show other users’ activity, offer different products to your customers based on their navigation history, and so on. But the more dynamic a website is, the more database queries you’ll probably need to perform. Unfortunately, these database queries consume the largest portion of your running time.

In this tutorial, I will demonstrate a way to improve performance by avoiding unnecessary queries. We’ll develop a query caching system for our data layer at a small programming and deployment cost.

1. The Data Access Layer

Adding a caching layer transparently to an application is often difficult because of the internal design. With object oriented languages (like PHP 5) it is a lot easier, but it still can be complicated by poor design.

In this tutorial, we set our starting point in an application that performs all its database access through a centralized class from which all data models inherit the basic database access methods. The skeleton for this starting class looks like this:

class model_Model {

    protected static $DB = null;

    function __construct () {}

    protected function doStatement ($query) {}

    protected function quoteString ($value) {}
}

Let’s implement it step by step. First, the constructor that will use the PDO library to interface with the database:

    function __construct () {

        // connect to the DB if needed
        if ( is_null(self::$DB) ) {

            $dsn = app_AppConfig::getDSN();
            $db_user = app_AppConfig::getDBUser();
            $db_pass = app_AppConfig::getDBPassword();

            self::$DB = new PDO($dsn, $db_user, $db_pass);

            self::$DB->setAttribute( PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION );
        }
    }

We connect to the database using the PDO library. For the database credentials I use a static class named “app_AppConfig” that centralizes the application’s configuration information.

To store the database connection, we use a static attribute ($DB). We use a static attribute in order to share the same connection with all the instances of “model_Model”, and, because of that, the connection code is protected with an if (we don’t want to connect more than once).

In the last line of the constructor we set the exception error mode for PDO. In this mode, for every error PDO finds, it throws an exception (class PDOException) instead of returning error values. This is a matter of taste, but the rest of the code can be kept cleaner with the exception mode, which is good for this tutorial.

Executing queries can be very complex, but in this class we have taken a simple approach with a single doStatement() method:

    protected function doStatement ($query) {
        $st = self::$DB->query($query);
        if ( $st->columnCount()>0 ) {
            return $st->fetchAll(PDO::FETCH_ASSOC);
        } else {
            return array();
        }
    }

This method executes the query, and returns an associative array with the entire result set (if any). Note that we are using the static connection (self::$DB). Note, also, that this method is protected. This is because we don’t want the user to execute arbitrary queries. Instead, we will provide concrete models to the user. We will see this later, but first let’s implement the last method:

    protected function quoteString ($value) {
        return self::$DB->quote($value,PDO::PARAM_STR);
    }

The “model_Model” class is a very simple but convenient class for data layering. Although it’s simple (it can be enhanced with advanced features like prepared statements if you want), it does the basic stuff for us.
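To give an idea of what the prepared statement enhancement might look like, here is a rough standalone sketch. It is not part of the original class: the function name do_prepared_statement() is hypothetical, and the in-memory SQLite database (which requires the pdo_sqlite extension) is used only so that the example runs without a database server.

```php
<?php
// Hypothetical variant of doStatement() that accepts bound parameters.
function do_prepared_statement(PDO $db, $query, array $params = array()) {
    $st = $db->prepare($query);
    // the values are bound by the driver, so no manual quoting is needed
    $st->execute($params);
    if ( $st->columnCount()>0 ) {
        return $st->fetchAll(PDO::FETCH_ASSOC);
    } else {
        return array();
    }
}

// in-memory SQLite database, an assumption made for this demonstration only
$db = new PDO('sqlite::memory:');
$db->setAttribute( PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION );
$db->exec("CREATE TABLE words (word char(30), doc_id integer not null)");

do_prepared_statement($db, "INSERT INTO words (word, doc_id) VALUES (?, ?)", array("coche", 4630));
$rows = do_prepared_statement($db, "SELECT doc_id FROM words WHERE word = ?", array("coche"));
```

With bound parameters, a helper like quoteString() is no longer needed for the values in the query.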

To complete the configuration part of our application, let’s write the “app_AppConfig” static class:

class app_AppConfig {

    static public function getDSN () {
        return "mysql:host=localhost;dbname=test";
    }

    static public function getDbUser ()  {
        return "test";
    }

    static public function getDbPassword () {
        return "MyTest";
    }
}

As stated before, we will provide concrete models to access the database. As a little example, we will use this simple schema: a documents table and an inverted index to search whether a document contains a given word or not:

CREATE TABLE documents (
    id        integer primary key,
    owner    varchar(40) not null,
    server_location    varchar(250) not null
);

CREATE TABLE words (
    word        char(30),
    doc_id    integer not null references documents(id),

    PRIMARY KEY (word,doc_id)
);

From the basic data access class (model_Model), we derive as many classes as needed by the data design of our application. In this example, we can derive these two self-explanatory classes:

class model_Index extends model_Model {

    public function getWord ($word) {
        return $this->doStatement("SELECT doc_id FROM words WHERE word=" . $this->quoteString($word));
    }
}

class model_Documents extends model_Model {

    public function get ($id) {
        return $this->doStatement( "SELECT * FROM documents WHERE id=" . intval($id) );
    }
}

These derived models are where we add the public methods. Using them is extremely simple:

$index = new model_Index();
$words = $index->getWord("coche");
var_dump($words);

The result for this example might look similar to this (obviously it depends on your actual data):

array(119) {
  [0]=>
  array(1) {
    ["doc_id"]=>
    string(4) "4630"
  }
  [1]=>
  array(1) {
    ["doc_id"]=>
    string(4) "4635"
  }
  [2]=>
  array(1) {
    ["doc_id"]=>
    string(4) "4873"
  }
  [3]=>
  array(1) {
    ["doc_id"]=>
    string(4) "4922"
  }
  [4]=>
  array(1) {
    ["doc_id"]=>
    string(4) "5373"
  }
...

What we have written is shown in the next UML class diagram:

2. Planning our Caching Scheme

When things start to collapse in your database server, it is time to take a break and consider optimizing the data layer. After optimizing your queries, adding the proper indexes, etc., the second move is to try to avoid unnecessary queries: why make the same request to the database on every user request, if this data hardly changes?

With a well-planned and well-decoupled class organization, we can add an extra layer to our application with almost no programming cost. In this case, we are going to extend the “model_Model” class to add transparent caching to our database layer.

The Caching Basics

Since we know that we need a caching system, let’s focus on that particular problem and, once sorted out, we will integrate it in our data model. For now, we won’t think in terms of SQL queries. It’s easy to abstract a little and build a general enough scheme.

The simplest caching scheme consists of [key,data] pairs, where the key identifies the actual data we want to store. This scheme is not new; in fact, it is analogous to PHP’s associative arrays, and we use it all the time.

So we will need a way to store a pair, to read it, and to delete it. That’s enough to build our interface for cache helpers:

interface cache_CacheHelper {

    function get ($key);

    function put ($key,$data);

    function delete ($key);
}

The interface is quite simple: the get method gets a value, given its identifying key; the put method sets (or updates) the value for a given key; and the delete method deletes it.
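To get a feel for the interface before choosing a storage system, here is a minimal sketch of a helper that keeps the pairs in a plain PHP array. The class name cache_Memory is hypothetical and not part of this tutorial, and returning null for a missing key is an assumption made here for illustration:

```php
<?php
// The interface from the tutorial, repeated so the sketch runs on its own.
interface cache_CacheHelper {
    function get ($key);
    function put ($key, $data);
    function delete ($key);
}

// Hypothetical helper (not part of the tutorial): keeps pairs in a PHP array.
class cache_Memory implements cache_CacheHelper {
    protected $pairs = array();

    function get ($key) {
        // Assumption: a missing key is reported as null.
        return array_key_exists($key, $this->pairs) ? $this->pairs[$key] : null;
    }

    function put ($key, $data) {
        // Insert the pair, or overwrite the previous value.
        $this->pairs[$key] = $data;
    }

    function delete ($key) {
        unset($this->pairs[$key]);
    }
}

$cache = new cache_Memory();
$cache->put("greeting", "hello");
echo $cache->get("greeting"), "\n";      // prints "hello"
$cache->delete("greeting");
var_dump($cache->get("greeting"));       // NULL
```

Such a helper only lives for the duration of one request, so it is no substitute for persistent storage, but it shows that any class offering these three methods can be plugged in.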

With this interface in mind, it’s time to implement our first real caching module. But before doing it, we will choose the data storage method.

The Underlying Storage System

The decision to build a common interface (like cache_CacheHelper) for caching helpers will allow us to implement them on top of nearly any storage system. But which one? There are a lot of them we can use: shared memory, files, memcached servers or even SQLite databases.

Often underestimated, DBM files are perfect for our caching system, and we are going to use them in this tutorial.

DBM files work natively on (key,data) pairs, and do it very fast due to their internal B-tree organization. They also do the access control for us: we don’t need to worry about locking the cache before writing (as we would have to do on other storage systems); DBM does it for us.

DBM files are not driven by expensive servers. They do their work inside a lightweight library on the client side, accessing the actual file that stores the data locally. In fact, they are a family of file formats, all of them with the same basic API for (key,data) access. Some of them allow repeated keys, others are constant and don’t allow writes after closing the file for the first time (cdb), etc. You can read more about that on http://www.php.net/manual/en/dba.requirements.php

Nearly every UNIX system installs one or more of these libraries (probably Berkeley DB or GNU dbm). For this example, we will use the “db4” format (Sleepycat DB4 format: http://www.sleepycat.com). I have found that this library is often preinstalled, but you can use whichever library you want (except cdb, of course: we want to write to the file). In fact, you could move this decision into the “app_AppConfig” class and adapt it for every project you do.

With PHP, we have two alternatives to deal with DBM files: the “dba” extension (http://php.net/manual/en/book.dba.php) or the “PEAR::DBA” module (http://pear.php.net/package/DBA). We will use the “dba” extension, which you probably already have installed on your system.
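As a rough illustration of making the format configurable, the sketch below picks the first handler from a preference list that the local dba extension actually offers. The function pickDbmHandler() and the order of preference are assumptions made for this example; dba_handlers() is the extension's own function for listing the compiled-in handlers:

```php
<?php
// Hypothetical helper: returns the first preferred handler that is available,
// or null when none of them is.
function pickDbmHandler (array $preferred, array $available) {
    foreach ($preferred as $handler) {
        if ( in_array($handler, $available, true) ) {
            return $handler;
        }
    }
    return null;
}

// dba_handlers() lists the handlers compiled into the dba extension;
// fall back to an empty list when the extension is not loaded.
$available = function_exists('dba_handlers') ? dba_handlers() : array();

// The order of preference is an assumption; cdb is left out on purpose
// because we need to write to the file.
var_dump( pickDbmHandler(array("db4", "gdbm", "ndbm"), $available) );
```

The result could then be returned from a method of the configuration class and passed to the function that opens the file instead of a hard-coded format name.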

Wait a minute, we are dealing with SQL and result sets!

DBM files work with strings for key and values, but our problem is to store SQL result sets (that can vary in structure quite a lot). How could we manage to convert them from one world to the other?

Well, for keys, it is very easy, because the actual SQL query string identifies a set of data very well. We can use the MD5 digest of the query string to shorten the key. For values, it is trickier, but here your allies are PHP’s serialize() and unserialize() functions, which can be used to convert arrays to strings and vice versa.
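Both conversions can be tried in isolation. The following sketch uses a made-up result set shaped like the var_dump() output shown earlier; the query text and the rows are illustrative only:

```php
<?php
// A made-up result set shaped like the var_dump() output earlier in the article.
$rows = array(
    array("doc_id" => "4630"),
    array("doc_id" => "4635"),
);

$query = "SELECT doc_id FROM words WHERE word='coche'";

// Key: the MD5 digest is always a 32-character hexadecimal string.
$key = md5($query);

// Value: serialize() flattens the array into a string ...
$stored = serialize($rows);

// ... and unserialize() rebuilds an equal array from it.
$restored = unserialize($stored);

echo strlen($key), "\n";          // prints 32
var_dump(is_string($stored));     // bool(true)
var_dump($restored === $rows);    // bool(true)
```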

We will see how all this works in the next section.

3. Static Caching

In our first example, we will deal with the easiest way to perform caching: caching for static values. We will write a class called “cache_DBM” implementing the interface “cache_CacheHelper”, like this:

class cache_DBM implements cache_CacheHelper {
    protected $dbm = null;

    function __construct ( $cache_file = null ) {
        $this->dbm = dba_popen($cache_file, "c", "db4"); 

        if ( !$this->dbm ) {
            throw new Exception("$cache_file: Cannot open cache file");
        }
    }

    function get ($key) {
        $data = dba_fetch($key, $this->dbm);
        if ( $data !== false ) {
            return $data;
        }
        return null;
    }

    function put ($key,$data) {
        if ( ! dba_replace($key, $data, $this->dbm) ) {
            throw new Exception("$key: Couldn't store");
        }
    }

    function delete ($key) {
        if ( ! dba_delete($key, $this->dbm) ) {
            throw new Exception("$key: Couldn't delete");
        }
    }
}

This class is very simple: a mapping between our interface and the dba functions. In the constructor, the given file is opened, and the returned handler is stored in the object in order to use it in the other methods.

A simple example of use:

$cache = new cache_DBM( "/tmp/my_first_cache.dbm" );
$cache->put("key1", "my first value");
echo $cache->get("key1");

$cache->delete("key1");
$data = $cache->get("key1");
if ( is_null($data) ) {
    echo "\nCorrectly deleted!";
}

Below, you’ll find what we have done here expressed as a UML class diagram:

Now let’s add the caching system to our data model. We could have changed the “model_Model” class in order to add caching to each of its derived classes. But, if we had done so, we would have lost the flexibility to assign the caching characteristic only to specific models, and I think this is an important part of our job.

So we will create another class, called “model_StaticCache”, which will extend “model_Model” and will add caching functionality. Let’s start with the skeleton:

class model_StaticCache extends model_Model {

    protected static $cache = array();
    protected $model_name = null;

    function __construct () { }

    protected function doStatement ($query) { }
}

In the constructor, we first call the parent constructor in order to connect to the database. Then, we create and store, statically, a “cache_DBM” object (if not created before elsewhere). We store one instance for every derived class name because we are using one DBM file for every one of them. For that purpose, we use the static array “$cache”.

    function __construct () {
        parent::__construct();

        $this->model_name = get_class($this);
        if ( ! isset( self::$cache[$this->model_name] ) ) {
            $cache_dir = app_AppConfig::getCacheDir();
            self::$cache[$this->model_name] = new cache_DBM( $cache_dir . $this->model_name);
        }
    }

To determine the directory in which to write the cache files, we have again used the application’s configuration class: “app_AppConfig”.

And now, the doStatement() method. Its logic is: convert the SQL statement to a valid key and search for that key in the cache. If it is found, return the cached value. If not, execute the query in the database, store the result, and return it:
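Note that getCacheDir() does not appear in the “app_AppConfig” listing earlier in the article. A minimal sketch of what it might look like follows; the location (a “cache” folder under the system temporary directory) is purely an assumption for illustration:

```php
<?php
class app_AppConfig {
    // (getDSN(), getDbUser() and getDbPassword() omitted for brevity)

    // Assumed location: a "cache" folder under the system temp directory.
    // The trailing separator matters, because the model constructor appends
    // the model name directly to this string.
    static public function getCacheDir () {
        return sys_get_temp_dir() . DIRECTORY_SEPARATOR . "cache" . DIRECTORY_SEPARATOR;
    }
}

// The file a model called "model_Documents" would end up using:
echo app_AppConfig::getCacheDir() . "model_Documents", "\n";
```

Whatever directory you choose has to exist and be writable by the user the web server runs as.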

    protected function doStatement ($query) {
        $key = md5($query);

        $data = self::$cache[$this->model_name]->get($key);
        if ( ! is_null($data) ) {
            return unserialize($data);
        }

        $data = parent::doStatement($query);

        self::$cache[$this->model_name]->put($key,serialize($data));

        return $data;
    }

Two things are worth noting. First, we are using the MD5 of the query as the key. This is not strictly necessary, because the underlying DBM library accepts keys of arbitrary size, but it seems better to shorten the key anyway. Second, if you are using prepared statements, remember to concatenate the actual values to the query string when creating the key!

Once “model_StaticCache” is created, modifying a concrete model to use it is trivial: you only need to change its “extends” clause in the class declaration:
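For the prepared-statement case, the key has to cover the bound values as well as the statement text. A minimal sketch, assuming a hypothetical makeCacheKey() helper (the “model_Model” class in this tutorial does not use prepared statements, so neither the helper nor its name comes from the original code):

```php
<?php
// Hypothetical helper: builds a cache key from a statement and its bound values.
function makeCacheKey ($query, array $params = array()) {
    // serialize() keeps the order and types of the values, so two different
    // sets of values never collapse into the same string.
    return md5($query . "|" . serialize($params));
}

$sql = "SELECT * FROM documents WHERE id = ?";

$key_a = makeCacheKey($sql, array(4630));
$key_b = makeCacheKey($sql, array(4635));

var_dump($key_a === $key_b);                            // bool(false)
var_dump($key_a === makeCacheKey($sql, array(4630)));   // bool(true)
```

Without the values in the key, both calls would share one cache entry and the second document would be answered with the first one's data.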

class model_Documents extends model_StaticCache {
}

And that’s all, the magic is done! “model_Documents” will now perform only one query for every document it retrieves. But we can do better.

4. Caching Expiration

In our first approach, once a query is stored in the cache, it remains valid until one of two things occurs: we delete its key explicitly, or we unlink the DBM file.

However, this approach is only valid for a few data models of our application: the static data (menu options and the like). The normal data in our application is likely to be more dynamic than that.

Think about a table containing the products we sell on our web page. It is not likely to change every minute, but there is a chance that this data will change (by adding new products, changing selling prices, etc.). We need a way to implement caching while still reacting to changes in the data.

One approach to this problem is to set an expiration time for the data stored in the cache. When we store new data in the cache, we set a window of time in which this data will be valid. After that time, the data will be read from the database again and stored in the cache for another period of time.

As before, we can create another derived class from “model_Model” with this functionality. This time, we will call it “model_ExpiringCache”. The skeleton is similar to “model_StaticCache”:

class model_ExpiringCache extends model_Model {

    protected static $cache = array();
    protected $model_name = null;
    protected $expiration = 0;

    function __construct () { }

    protected function doStatement ($query) { }
}

In this class we have introduced a new attribute: $expiration. It stores the configured time window for valid data. We set this value in the constructor; the rest of the constructor is the same as in “model_StaticCache”:

    function __construct () {
        parent::__construct();

        $this->model_name = get_class($this);
        if ( ! isset( self::$cache[$this->model_name] ) ) {
            $cache_dir = app_AppConfig::getCacheDir();
            self::$cache[$this->model_name] = new cache_DBM( $cache_dir . $this->model_name);
        }

        $this->expiration = 3600;   // 1 hour
    }

The bulk of the work happens in doStatement(). DBM files have no internal way to control the expiration of data, so we must implement our own. We’ll do it by storing arrays, like this one:

array(
        "time" => 1250443188,
        "data" => (the actual data)
)

This kind of array is what we serialize and store in the cache. The “time” key holds the time at which the data was written to the cache, and “data” holds the actual data we want to store. At read time, if we find that the key exists, we compare the stored time with the current time and return the data if it has not expired.

    protected function doStatement ($query) {
        $key = md5($query);
        $now = time();

        $data = self::$cache[$this->model_name]->get($key);
        if ( !is_null($data) ) {
            $data = unserialize($data);
            if ( $data['time'] + $this->expiration > $now ) {
                return $data['data'];
            }
        }

If the key doesn’t exist or has expired, we go on to execute the query and store the new result set in the cache before returning it.

        $data = parent::doStatement($query);

        self::$cache[$this->model_name]->put( $key,
                serialize( array("data"=>$data,"time"=>$now) ) );

        return $data;
    }

Simple!
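To see the expiration test at work without a database, the comparison can be pulled out into a small function and fed fixed timestamps. The function name isFresh() is hypothetical; the condition is the same one used in doStatement() above:

```php
<?php
// Hypothetical helper mirroring the test inside doStatement().
function isFresh (array $entry, $expiration, $now) {
    return $entry['time'] + $expiration > $now;
}

$entry = array("time" => 1250443188, "data" => array("some", "rows"));
$expiration = 3600;   // 1 hour

var_dump(isFresh($entry, $expiration, 1250443188 + 1800));   // bool(true): 30 minutes old
var_dump(isFresh($entry, $expiration, 1250443188 + 3600));   // bool(false): exactly 1 hour old
var_dump(isFresh($entry, $expiration, 1250443188 + 7200));   // bool(false): 2 hours old
```

Because the comparison is strict, an entry is treated as expired at the very second its window closes.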

Now let’s convert “model_Index” to a model with an expiring cache. As with “model_Documents”, we only need to modify the class declaration and change the “extends” clause:

class model_Index extends model_ExpiringCache {
}

About the expiration time… some considerations must be made. We use a constant expiration time (1 hour = 3,600 seconds) for the sake of simplicity, and because we don’t want to modify the rest of our code. But we can easily modify it in a lot of ways to allow different expiration times, one for each model. We will see how in the next section.

The class diagram for all our work is as follows:

5. Different Expiration

In every project, I am sure you will have different expiration times for nearly every model: from a couple of minutes to hours, or even days.

If only we could have a different expiration time for every model, it would be perfect… but, wait! We can do it easily!

The most direct approach is to add an argument to the constructor, so the new constructor for “model_ExpiringCache” will be this one:

    function __construct ( $expiration=3600 ) {
        parent::__construct();

        $this->expiration = $expiration;
        ...
    }

Then, if we want a model with a 1 day expiration time (1 day = 24 hours = 1,440 minutes = 86,400 seconds), we can accomplish it this way:

class model_Index extends model_ExpiringCache {
    function __construct() {
        parent::__construct(86400);
    }

    ...
}

And that’s all. However, the drawback is that we must modify all the data models.

Another way of doing it is to delegate the task to the “app_AppConfig”:

class app_AppConfig {
    ...
    public static function getExpirationTime ($model_name) {
        switch ( $model_name ) {
            case "model_Index":
                return 86400;
            ...
            default:
                return 3600;
        }
    }
}

And then add the call to this new method in the “model_ExpiringCache” constructor, like this:

    function __construct () {
        parent::__construct();

        $this->model_name = get_class($this);

        $this->expiration = app_AppConfig::getExpirationTime($this->model_name);

        ...
    }

This last method allows us to do fancy things, like using different expiration values for production and development environments, in a more centralized way. Either way, the choice is yours.
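As an illustration of the centralized approach, the sketch below replaces the switch with a lookup table and adds an environment switch. The $environment property, the “model_Banners” entry and all the numbers except the 86,400 and 3,600 seconds used above are assumptions made for this example:

```php
<?php
class app_AppConfig {
    // Assumed values for illustration; the "model_Banners" entry is made up.
    protected static $expiration = array(
        "model_Index"   => 86400,   // 1 day
        "model_Banners" => 300,     // 5 minutes
    );

    // Hypothetical switch; a real project might read it from the server setup.
    public static $environment = "production";

    public static function getExpirationTime ($model_name) {
        if ( self::$environment === "development" ) {
            return 5;   // very short window, so changes show up while developing
        }
        if ( isset(self::$expiration[$model_name]) ) {
            return self::$expiration[$model_name];
        }
        return 3600;    // default: 1 hour
    }
}

echo app_AppConfig::getExpirationTime("model_Index"), "\n";       // prints 86400
echo app_AppConfig::getExpirationTime("model_Documents"), "\n";   // prints 3600

app_AppConfig::$environment = "development";
echo app_AppConfig::getExpirationTime("model_Index"), "\n";       // prints 5
```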

In UML, the total project looks like this:

6. Some Caveats

There are some queries that cannot be cached. The most evident ones are modifying queries like INSERT, DELETE or UPDATE. These queries must always reach the database server.

But even with SELECT queries, there are some circumstances in which a caching system can create problems. Take a look at a query like this one:
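One way to keep modifying queries out of the cache is a rough check on the first keyword of the statement. The helper name isCacheable() is hypothetical, and the test is deliberately simple; it is a sketch, not a full SQL parser:

```php
<?php
// Hypothetical helper: only statements that start with SELECT are
// candidates for caching.
function isCacheable ($query) {
    return preg_match('/^\s*SELECT\b/i', $query) === 1;
}

var_dump(isCacheable("SELECT * FROM documents WHERE id=1"));   // bool(true)
var_dump(isCacheable("  select doc_id FROM words"));           // bool(true)
var_dump(isCacheable("UPDATE documents SET owner='x'"));       // bool(false)
var_dump(isCacheable("DELETE FROM words WHERE doc_id=1"));     // bool(false)
```

Inside a cached doStatement(), a guard that calls parent::doStatement($query) directly whenever isCacheable() returns false would send everything else straight to the database. Note that this alone does not refresh cached SELECT results after a write; they stay as they are until they expire or are deleted.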

SELECT * FROM banners WHERE zone='home' ORDER BY rand() LIMIT 10

This query randomly selects 10 banners for the “home” zone of our website. This is intended to generate movement in the banners shown on our home page, but if we cache this query, the user will not see any movement at all until the cached data expires.

The rand() function is not deterministic (nor are now() and others), so it will return a different value on every execution. If we cache it, we will freeze only one of those results for the whole caching period, thereby breaking the functionality.

But with a simple re-factoring, we can obtain the benefits of caching and show pseudo-randomness:

class model_Banners extends model_ExpiringCache {

    public function getRandom ($zone) {
        $random_number = rand(1,50);
        $banners = $this->doStatement( "SELECT * FROM banners WHERE zone=" .
                $this->quoteString($zone) .
                " AND $random_number = $random_number ORDER BY rand() LIMIT 10" );
        return $banners;
    }
...
}

What we are doing here is caching fifty different random banner configurations and selecting among them randomly. The 50 SELECTs will look like this:

SELECT * FROM banners WHERE zone='home' AND 1=1 ORDER BY rand() LIMIT 10
SELECT * FROM banners WHERE zone='home' AND 2=2 ORDER BY rand() LIMIT 10
...
SELECT * FROM banners WHERE zone='home' AND 50=50 ORDER BY rand() LIMIT 10

We have added a constant condition to the select, which has no cost to the database server but yields 50 different keys for the caching system. A user will need to load the page at least fifty times to see all the different banner configurations, so the dynamic effect is achieved. The cost is fifty queries to the database to fill the cache.
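That the fifty variants really end up under fifty different keys can be checked without touching the database. The sketch below rebuilds the query text in the same way as getRandom(); the helper name bannerQuery() is hypothetical, and the zone is passed in already quoted:

```php
<?php
// Builds the query text the same way getRandom() does, with the zone already quoted.
function bannerQuery ($quoted_zone, $n) {
    return "SELECT * FROM banners WHERE zone=" . $quoted_zone .
           " AND $n = $n ORDER BY rand() LIMIT 10";
}

$keys = array();
for ($n = 1; $n <= 50; $n++) {
    $keys[] = md5( bannerQuery("'home'", $n) );
}

// Every variant hashes to its own key, so the cache holds up to 50 entries per zone.
echo count(array_unique($keys)), "\n";   // prints 50
```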

7. A Benchmark

What benefits can we expect from our new caching system?

First, it must be said that, in raw performance, our new implementation will sometimes run slower than database queries, especially with very simple, well-optimized queries. But for queries with joins, our DBM cache will run faster.

However, the problem we solved is not raw performance. In production you will never have an idle database server at your disposal; you’ll probably have a server under high workload. In this situation, even the fastest query can run slowly, but with our caching scheme we are not even using the database server and, in fact, we are reducing its workload. So the real performance increase will come in the form of more requests served per second.

On a website that I am currently developing, I have done a simple benchmark to understand the benefits of caching. The server is modest: it runs Ubuntu 8.10 on an AMD Athlon 64 X2 5600+, with 2 GB of RAM and an old PATA hard disk. The system runs Apache and MySQL 5.0, as shipped with the Ubuntu distribution, without any tuning.

The test was to run Apache’s benchmark program (ab) with 1, 5 and 10 concurrent clients loading a page 1,000 times from my development website. The actual page was a product detail page that runs no fewer than 20 queries: menu contents, product details, recommended products, banners, etc.

The results without caching were 4.35 pages per second (p/s) for 1 client, 8.25 for 5 clients, and 8.29 for 10 clients. With caching (using different expiration times), the results were 25.55 p/s for 1 client, 49.01 for 5 clients, and 48.74 for 10 clients.
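To put those figures in perspective, the ratio between the two runs can be worked out with a few lines of PHP. The numbers are the ones reported above; only the variable names are made up:

```php
<?php
// Pages per second reported above, indexed by number of concurrent clients.
$without_cache = array(1 => 4.35,  5 => 8.25,  10 => 8.29);
$with_cache    = array(1 => 25.55, 5 => 49.01, 10 => 48.74);

$speedup = array();
foreach ($without_cache as $clients => $rate) {
    $speedup[$clients] = $with_cache[$clients] / $rate;
    printf("%2d client(s): %.2fx\n", $clients, $speedup[$clients]);
}
// Output:
//  1 client(s): 5.87x
//  5 client(s): 5.94x
// 10 client(s): 5.88x
```

In other words, the cached version serves a little under six times as many pages per second at every concurrency level tested.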

Final Thoughts

I’ve shown you an easy way to insert caching into your data model. Of course, there is a plethora of alternatives; this is just one of the choices you have.

We have used local DBM files to store the data, but there are even faster alternatives that you might consider exploring. Some ideas for the future: using APC’s apc_store() function as the underlying storage system, shared memory for the really critical data, memcached, etc.

I hope you have enjoyed this tutorial as much as I enjoyed writing it. Happy caching!


