Random URLs and How to Create Them in ASP.NET

Introduction

This article is about random URLs, which let application owners create small URLs that can be saved in a database or passed on to another application as short text instead of long, boring URLs.

Many companies, such as Twitter, Facebook and Tumblr, use this technique to provide short links to their own resources or to any resource a user would like to share. They let users turn a link to a document that might be present at:

http://www.example.com/post/category/postId/some-other-text-from-post

... into just the following link:

http://www.examp.le/pSij284Fj

Now let us understand the concept and how to implement this feature in our ASP.NET websites. Remember that the URL generator itself is pure C# and uses nothing from the ASP.NET namespaces or framework. That means you can also use it in other .NET applications such as WPF and Windows Forms. Only the database and redirect examples later in the article depend on ASP.NET.

Longer URLs: The Problem

A URL is a link to a specific document on the internet that a user can access at any time using any software that can reach the internet. Sometimes these URLs are long enough to irritate users, such as the ones that have all of their information appended to the URL itself. Take the following URL as an example:

http://www.example.com/page/category/sub-category/month-date/some-general-fancy-text

Now this is really long text, and many URLs are even longer than this one and span more than 500 characters. Browsers don't like these longer URLs, and Internet Explorer would sometimes have trouble with mailto: URLs of more than 500 characters.

Not only do browsers dislike long URLs; search engines do too, and it has been reported that Google does not index URLs longer than about 1855 characters (Reference).

So it's not just we, the users, who hate longer URLs because we cannot revisit them after closing the tab without digging through the History. Software also limits how many characters a URL may contain.

Why it is not preferred

There is usually more than one reason not to use longer URLs. Everyone has a pet peeve and so do I: I do not like to fill up my URL bar with lots of data. I prefer to see only the website's domain and the page I am on.

It is also better to send shorter URLs than longer ones. Each character of a URL sent over the network takes up one byte, because URLs are restricted to ASCII characters and anything else is percent-encoded into several ASCII characters. Inside a .NET application the same text takes twice as much space, because a char in .NET is always a 2-byte UTF-16 code unit. An overview of characters in .NET is in the documentation for System.Char.

For smooth transfer of data between computers and devices it is better to send less overhead so that the actual message arrives faster. A 2048-character URL is about 2 KB on the wire (and about 4 KB as a .NET string in memory) just to identify the resource. That data is pure overhead in the request: it has no use other than addressing the web resource the user wants to reach or send data to. For a user on a high-speed connection this wouldn't be a problem, but for mobile users and others on a slow connection it can be noticeable, because the address must be sent before anything else can be done.
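To make the arithmetic concrete, here is a small sketch that measures a URL both ways. The class and method names (UrlSize, BytesOnWire, BytesInMemory) are my own and are not part of the project in this article; the sketch assumes the URL contains only ASCII characters.

```csharp
using System;
using System.Text;

public static class UrlSize
{
    // Bytes the URL occupies when it is sent as ASCII text (1 byte per character).
    public static int BytesOnWire(string url)
    {
        return Encoding.ASCII.GetByteCount(url);
    }

    // Bytes the characters occupy inside a .NET string (UTF-16, 2 bytes per char).
    public static int BytesInMemory(string url)
    {
        return url.Length * sizeof(char);
    }
}
```

For any ASCII-only URL, BytesInMemory returns exactly twice what BytesOnWire returns.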

What might be done

Usually URLs are not shortened and the long URLs are sent as they are, but there are cases where they must be minimized. Many companies need this feature, especially Twitter, the blogging website with a 140-character limit: it stores the long URL in its system and shows only a short URL embedded in the Tweet. This way any URL, no matter how long, becomes just a few characters long: 15 characters on Twitter, and possibly fewer on other services such as goo.gl (Google's URL shortener) and fb.me (Facebook's short URLs).

This enables people to write short URLs such as http://exm.pl/AfzaalBlog (which would redirect to my blog) that a user is able to remember. This URL is better because:

  1. It is short.

  2. It uses less data in transfer.

  3. It is easy to remember.

  4. It is more semantic and tells the user where they are actually going, more like a 2 - 4 word sentence.

These are a few reasons why people should use short URLs, and not only to send the data quickly: short and long URLs don't seem any different on fast internet connections, but they make a big difference on slow ones.

Random URLs

If what was said previously, that "short URLs are better", is correct, then why do people even want to use random URLs, when creating a random URL for each and every visitor is extra trouble?

Why use Random URLs

A short URL is a better URL, as the preceding points explain. But there are usually many users who want a URL for each of their resource files. Hand-picking every one would mean the user filling in a form to get a short URL on your server, agreeing to the terms, and only then taking that URL away to use. Well, that is a long process (and I am exaggerating a little).

For that, a website might want a set of functions that create a random alphanumeric string, generated anew for each and every attempt by a user to get a new URL. It stays unique by being checked against every previously created string.

The number of URLs that can be created from these alphanumeric strings depends on the number of characters allowed in the URL and on the size of the character set. This is a matter of permutations with repetition: each position can hold any allowed character, so the count is the size of the character set raised to the power of the length. For example, a simple string of 10 characters that can each hold one of the 10 digits gives 10^10 (ten billion) different strings, which is enough to cover a huge company's document resources without duplicating any link at all.

You can make this even stronger by using both lower-case and capital letters in your URL string. The 52 letters plus the 10 digits give 62 symbols per position instead of 10, so the number of possible strings grows enormously, and none of your users will ever complain about an error such as "Sorry, all out of URLs".
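Since I admitted above that I am not good with the mathematics, here is a small sketch that lets the computer do the counting. The KeySpace class is a hypothetical helper, not part of the project; it only raises the character-set size to the power of the string length.

```csharp
using System.Numerics;

public static class KeySpace
{
    // Number of distinct strings of the given length when every position
    // can hold any one of alphabetSize different characters.
    public static BigInteger Count(int alphabetSize, int length)
    {
        return BigInteger.Pow(alphabetSize, length);
    }
}
```

KeySpace.Count(10, 10) gives 10000000000 for digits only, while KeySpace.Count(64, 10), for the 64-symbol set used later in this article, gives 1152921504606846976.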

Making a stronger mixture, a Really Random URL

Since a random URL needs to be random and the intent is to generate short URLs of no more than 7 - 10 characters, the real task is to make these short URLs genuinely random and not just some string that is used in the URLs.

In this example I have added two special characters to enlarge the character set a little. The URL must be sufficiently random that, in the near and even the distant future, the chance of duplication stays minimal. A duplicate that is not handled stops the application from working properly, because the main logic that generates unique short URLs has failed.

In the code shown later in this article, if you pay attention to the condition where the choice between a character and a number is made, you can see that a number is selected one time in three: rand.Next(0, 3) returns 0, 1 or 2 and only 1 selects a number, so roughly 33% of the positions are numbers and 67% are characters. Why did I do that, and how does it affect the URL in reality?

Actually the string we're going to create is an arrangement of the 52 upper- and lower-case letters plus two special characters (-, _), which lowers the chance of duplication a little further. Then there is another list that contains the numbers, and the two are mixed to create a random URL. But in what proportion? That depends on the mixture. Ever seen a chemist at work? He combines ingredients in specific amounts to create a compound that is perfectly stable.

Similarly, adding a few numbers to the character set gives a good result and a huge number of possible random URLs. A statistics or mathematics student could help us find the best proportion to minimize any chance of duplication. The general rule is that duplicates are least likely when each of the 64 possible symbols is equally likely at every position, which means a number about 16% of the time (10 out of 64). If the share of numbers is raised to 50%, the 10 digits are over-represented and duplicates become more likely. Leaving numbers out entirely (or giving them a rare chance) shrinks the character set from 64 to 54 symbols, which also reduces the number of possible URLs. The one-in-three ratio used here is a simple approximation rather than the optimum; whichever ratio you choose, the database check described later is what actually guarantees uniqueness.

One last layer of protection against duplication is to respect the case of the string: B is not equal to b. You could compare the strings character by character, or you can simply use the .Equals() method (declared on System.Object and overridden by string), which compares both the characters and their case. For example:
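The effect of the mixture can be estimated with a little arithmetic. The following is only a sketch under my own assumptions: 10 digits, 54 other symbols, and each position chosen independently. MixtureOdds is a hypothetical name, not part of the project.

```csharp
public static class MixtureOdds
{
    // Chance that two independently generated characters are identical when a digit
    // (10 choices) is picked with probability digitShare and one of the 54 other
    // symbols is picked otherwise. Lower is better.
    public static double SamePositionMatch(double digitShare)
    {
        double otherShare = 1.0 - digitShare;
        return digitShare * digitShare / 10.0 + otherShare * otherShare / 54.0;
    }
}
```

With a digit share of 10/64 the result is exactly 1/64 (about 0.0156), with one in three it is about 0.019, and with 50% it rises to about 0.030, so heavier use of numbers makes matching characters noticeably more likely.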

    // creating the variables
    string a = "Love for all Hatred for none!";
    string b = "LOVE FOR ALL HATRED FOR NONE!";
    string c = "love for all hatred for none!";
    string d = "Love for all Hatred for none!";

    // testing the objects, whether they're equal or not.
    if (a.Equals(b)) {
        Console.WriteLine("a matches b.");
    } else {
        Console.WriteLine("a doesn't match b");
    }
    if (a.Equals(c)) {
        Console.WriteLine("a matches c");
    } else {
        Console.WriteLine("a doesn't match c");
    }
    if (a.Equals(d)) {
        Console.WriteLine("a matches d");
    } else {
        Console.WriteLine("a doesn't match d");
    }

    // Output of the program
    // a doesn't match b
    // a doesn't match c
    // a matches d

 

This shows how to compare strings case-sensitively to add another layer of protection against duplication. You can run it in an online C# fiddle to test it.

Generating Random URLs in ASP.NET

.NET lets you create a GUID (System.Guid), a long string that is unique for practical purposes and is claimed to "never duplicate". But you can also write a short snippet of code that keeps giving you a shorter string with a random set and sequence of characters.
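For comparison, this is roughly what the GUID route looks like. GuidToken is a hypothetical wrapper name; Guid.NewGuid() and the "N" format specifier are standard .NET.

```csharp
using System;

public static class GuidToken
{
    // Returns a new GUID as 32 hexadecimal digits, without hyphens or braces.
    public static string Create()
    {
        return Guid.NewGuid().ToString("N");
    }
}
```

The result is always 32 characters long, which is why a GUID is a poor fit when the goal is a URL of 7 - 10 characters.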

Code for function

The first step is to generate a random string that is built from letters (characters) and numeric data (numbers). To do that we create two lists: one contains all of the character data for our URL and the other contains all of the numbers (0-9).

Among the characters I have added "-" and "_" just to lower the chance of duplication further. You can add any other character that is valid in a URL, and each one adds to the number of possible URLs in your application.

The second thing is generating the random values. The Random class generates numbers (we use integers here), not characters. So we use the generated integer as an index to select one element from a collection. A List<T> is a collection of elements of type T that supports indexers, and each selected element is concatenated to the string of the actual URL that we're going to write to the stream, or anywhere else.

    // Required namespaces
    using System;
    using System.Collections.Generic;

    // List of characters and numbers to be used...
    string URL = "";
    List<int> numbers = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 };
    List<char> characters = new List<char>()
    {
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
        'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
        'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
        'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
        '-', '_'
    };

    // Create one instance of the Random
    Random rand = new Random();
    // run the loop till I get a string of 10 characters
    for (int i = 0; i < 10; i++) {
        // Pick 0, 1 or 2; only 1 selects a number (one chance in three)
        int random = rand.Next(0, 3);
        if (random == 1) {
            // use a number
            random = rand.Next(0, numbers.Count);
            URL += numbers[random].ToString();
        } else {
            // Use a character
            random = rand.Next(0, characters.Count);
            URL += characters[random].ToString();
        }
    }

Using a separate Class

Remember to write this function inside a separate class from where you will be calling it. The class can have a single public member, the GetURL() method, and you can make it static so that no one needs to create an instance of it.

This will minimize code repetition, which goes against good programming practice (but is off-topic for this article). If you create a class, it would look like this:

    // Required namespaces
    using System;
    using System.Collections.Generic;

    /// <summary>
    /// RandomURL class generates Random URLs for applications.
    /// </summary>
    public static class RandomURL
    {
        // List of characters and numbers to be used...
        private static readonly List<int> numbers = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 };
        private static readonly List<char> characters = new List<char>()
        {
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
            'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
            'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
            'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
            '-', '_'
        };

        // One shared Random instance for every call (see the tip at the end of the article).
        // Random is not thread-safe, so access to it is locked.
        private static readonly Random rand = new Random();

        public static string GetURL()
        {
            string URL = "";
            lock (rand)
            {
                // run the loop till I get a string of 10 characters
                for (int i = 0; i < 10; i++)
                {
                    // Pick 0, 1 or 2; only 1 selects a number (one chance in three)
                    int random = rand.Next(0, 3);
                    if (random == 1)
                    {
                        // use a number
                        random = rand.Next(0, numbers.Count);
                        URL += numbers[random].ToString();
                    }
                    else
                    {
                        // use a character
                        random = rand.Next(0, characters.Count);
                        URL += characters[random].ToString();
                    }
                }
            }
            return URL;
        }
    }

 

Note that the members (the method and both of the lists) are now static. Since their values never change while the application runs and we never need multiple instances of the class at the same time, we make them static and use them without creating an instance each time.
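As a variation, and not the method used in the rest of this article, the same 64 symbols can be kept in a single string and selected with equal probability using a cryptographic random number generator. This is a minimal sketch; UniformRandomUrl and its Create method are hypothetical names of my own.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class UniformRandomUrl
{
    // 26 lower-case + 26 upper-case letters, 10 digits, '-' and '_': exactly 64 symbols.
    private const string Alphabet =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_";

    public static string Create(int length)
    {
        byte[] bytes = new byte[length];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }

        StringBuilder builder = new StringBuilder(length);
        foreach (byte b in bytes)
        {
            // 256 is a multiple of 64, so keeping the low 6 bits
            // leaves every symbol equally likely.
            builder.Append(Alphabet[b & 63]);
        }
        return builder.ToString();
    }
}
```

Calling UniformRandomUrl.Create(10) returns a 10-character string. The trade-off is that this generator cannot be seeded, so its output cannot be reproduced for testing.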

Result of this code

I ran the same code, both as a function and as a class, and it worked! The web page of my ASP.NET website that showed the generated URL looked like the following:

[Screenshot: the generated URL]

Each run of the code gives you a random URL that you can send in short messages or use for other tasks in applications.

Saving and Re-using the Original URL

This topic was started in the comments: "How to get the original URL back for re-use". It was a good topic to talk about, and it prompted me to change the entire project, adding a few more code blocks and a feature that gets the original URL from the database and uses a redirect as a practical example for this article.

Saving the URL

The first stage is to store the URL in the database. You can design your database to store the long URL and the short URL that you're going to use in this process. The following schema can be used; please understand that the ID is not required in this method.

[Screenshot: schema of the RandomUrls table]

Some important things to note here are as follows:

    1. The long URL's size is set to a maximum of 4000, because that is as long as it can be.
    2. The short URL's size is set to 50, because a short URL must stay short.
    3. The ID is not required, as already said.

Now you can simply write the back-end code to save the data and generate a new short URL to be associated with the long URL in your application.

Note: I didn't change the class or the C# code; I only updated the UI and the logic of the application to save the URLs, and created a new page to redirect the user to the original URL that was shortened. So do not be confused about whether to use the previously posted code or not. This is a new feature.

Create an HTML form in the web page to accept the URLs from the user.

    <form method="post">
        <input type="text" name="url" />
        <input type="submit" value="Submit" />
    </form>

That is enough. Now write the server-side code that saves the URL to the database.

    string URL = "";
    string longUrl = "";
    if (IsPost) {
        longUrl = Request.Form["url"];
        // Change the UI of the web page.
        // Open the connection
        var db = Database.Open("StarterSite");

        // Generate short URLs until one is found that has no match in the database
        do {
            URL = RandomURL.GetURL();
        } while (db.Query("SELECT * FROM RandomUrls WHERE ShortUrl = @0", URL).Count() > 0);

        // Now the URL is unique, so save it...
        db.Execute("INSERT INTO RandomUrls (UrlString, ShortUrl) VALUES (@0, @1)", longUrl, URL);
    }

A little explanation: the preceding code runs when the request is a POST. It captures the URL that the user posted in the form and looks in the database; if the short URL generated has a match, it creates a new random URL (this will rarely be needed), and then it saves both URLs in the database.

Once executed, it will save the URLs like this:

[Screenshot: the saved long and short URLs in the database table]

You can add your own conditions too, for example checking whether the long URL already exists in the database.
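The generate, check and retry idea does not depend on a database. The following sketch shows it with an in-memory dictionary so it can run anywhere; InMemoryUrlStore, Save and Resolve are hypothetical names, and the key generator is passed in as a parameter so that any generator (such as RandomURL.GetURL) can be plugged in.

```csharp
using System;
using System.Collections.Generic;

public class InMemoryUrlStore
{
    // Ordinal comparison keeps the keys case-sensitive: "aB" and "ab" are different.
    private readonly Dictionary<string, string> longByShort =
        new Dictionary<string, string>(StringComparer.Ordinal);

    // Saves longUrl under a key produced by generateKey,
    // asking for a new key for as long as the key is already taken.
    public string Save(string longUrl, Func<string> generateKey)
    {
        string key = generateKey();
        while (longByShort.ContainsKey(key))
        {
            key = generateKey();
        }
        longByShort[key] = longUrl;
        return key;
    }

    // Returns the long URL stored for shortKey, or null when the key is unknown.
    public string Resolve(string shortKey)
    {
        string longUrl;
        return longByShort.TryGetValue(shortKey, out longUrl) ? longUrl : null;
    }
}
```

A real application would probably also limit the number of retries, so that a nearly exhausted key space cannot keep the loop running forever.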

Extracting the original URL and redirecting the user

The main part of this process is extracting the real URL to which the user will be redirected. It is as simple as a query to the database. First of all create a new ASP.NET page where you can write the code to query the database. I queried the database and listed the URLs like this:

[Screenshot: page listing the stored URLs]

You have saved the generated ShortURL string in your database, associated with the long URL. You can pass that ShortURL to the page and perform different tasks with it; in this article the task is redirection. Write this code in your new ASP.NET page:

    // Get the short url
    var shortUrl = UrlData[0];

    // find it in the database (a single query; null when there is no match)
    var db = Database.Open("StarterSite");
    var row = db.Query("SELECT * FROM RandomUrls WHERE ShortUrl = @0", shortUrl).FirstOrDefault();
    if (row != null) {
        // Redirect to the original long URL
        Response.Redirect(row.UrlString);
    } else {
        // Unknown short URL: go back to the home page
        Response.Redirect("~/");
    }

The preceding code looks in the database and finds the actual long URL that was shortened. In my application the short URL was passed to the redirect page as:

[Screenshot: browser address bar with the localhost URL]

Since this short URL was connected to Google, the result was this page:

[Screenshot: the Google home page]

This way, you can simply redirect the user to the actual long URL of the web page that was shortened.

Tip: Why use only one instance of Random

I found it interesting enough to share why I said, in the article and in the following Points of Interest section, to "use only one Random instance and use the .Next() method to create a new random number". The key point is that these random numbers are generated from a seed. If you pass a seed to the constructor, it is used to generate the sequence of random numbers. Otherwise, on the .NET Framework, the system's current time is used as the seed, and since instances created in quick succession read the same clock value (unless a debugger or some other delay gets in between), all of those instances would produce the same random numbers.

To avoid this error, create one Random instance and keep calling its .Next() method, with the value range you want, to get each new random number.
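The effect of the seed is easy to demonstrate by passing the same seed explicitly. This is only an illustration; SeedDemo is a hypothetical name and the range 0 - 63 is an arbitrary choice.

```csharp
using System;

public static class SeedDemo
{
    // Returns the first 'count' values of Next(0, 64) from a Random created with 'seed'.
    public static int[] Sequence(int seed, int count)
    {
        Random rand = new Random(seed);
        int[] values = new int[count];
        for (int i = 0; i < count; i++)
        {
            values[i] = rand.Next(0, 64);
        }
        return values;
    }
}
```

SeedDemo.Sequence(42, 5) returns the same five numbers every time it is called, which is exactly what happens by accident when several Random instances end up with the same time-based seed.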

Points of Interest

Creating a simple class that generates these random URLs is easy, and it also keeps you from violating the "don't repeat your code" rule in your application.

Creating only one Random instance is better; once created, keep using its .Next() method to get each new random number in each and every function.