File Tampering Detection in VB.NET

Introduction:

This article describes an easy approach to determining whether two files are exactly the same. The purpose of the test is to determine whether a file has been edited or tampered with in any way by comparing it against an original. The code and sample application demonstrate two methods for determining the status of the file.

The approach indicated is recommended by Microsoft and mention of it was made in Matthew MacDonald's Visual Basic .NET book published by Microsoft Press; I have found the approach useful in determining whether or not a file has been altered by comparing that suspect file against the original.

FileTamperTest1.gif

Figure 1. The Sample Application in Use

Getting Started

In order to get started, unzip the included project and open the solution in the Visual Studio 2005 environment. In the solution explorer, you should note the following:

FileTamperTest2.gif

Figure 2. Solution Explorer

As you can see, there is only a single form contained in this Windows application project (frmMain.vb). There were no additional references or resources added to the project and only the default settings are necessary to support the code used.

The design of the form is simple. Two sets of controls (each a text box and a button) are used in conjunction with an Open File Dialog to search for and load two files. One file is the source file, and the second is the file that will be compared against the source. Two additional buttons kick off the two tests that will be run against the selected files. Lastly, there is a button used to terminate the application:

FileTamperTest3.gif

Figure 3. The Main Form Designer

The Code: Main Form (frmMain.vb)

The main form class includes two imports which are necessary to support the sample application:

Imports System.Security.Cryptography

Imports System.IO

Cryptography exposes the HashAlgorithm class, which allows the application to convert the content of a file stream or byte array into a hash value; that value may in turn be used as the basis for a comparison between the source file and the test file. This approach is sensitive to even the most minor change (such as removing or adding a single space).

IO is added to allow for the manipulation of the files themselves.
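The sensitivity of a hash to small changes can be seen in a minimal console sketch that is separate from the sample application. The module name, the sample strings, and the choice of SHA256 are assumptions made for illustration; the article's own code uses HashAlgorithm.Create().

```vbnet
Imports System
Imports System.Security.Cryptography
Imports System.Text

Module HashSensitivityDemo
    Sub Main()
        ' Two illustrative inputs that differ only by one trailing space.
        Dim original As Byte() = Encoding.UTF8.GetBytes("The boat sails at dawn")
        Dim edited As Byte() = Encoding.UTF8.GetBytes("The boat sails at dawn ")

        ' SHA256 is an assumption for this sketch, not the article's choice.
        Using hasher As SHA256 = SHA256.Create()
            Dim hash1 As String = BitConverter.ToString(hasher.ComputeHash(original))
            Dim hash2 As String = BitConverter.ToString(hasher.ComputeHash(edited))

            Console.WriteLine(hash1)
            Console.WriteLine(hash2)
            ' The two hex strings differ, so this prints "Hashes match: False".
            Console.WriteLine("Hashes match: " & (hash1 = hash2).ToString())
        End Using
    End Sub
End Module
```

The two printed hashes share no obvious resemblance even though the inputs differ by a single character, which is what makes a hash useful as a tamper check.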

The first block of code in the application is used to terminate the application whenever the user clicks the "Exit" button:

Public Class frmMain

    Private Sub btnExit_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles btnExit.Click

        ' Close the application.
        Application.Exit()

    End Sub

Following the exit button click event handler, the next two code blocks are used to handle the click events for the browse buttons used on the form. Since the two handlers are roughly the same, I will only show one of them here:

Private Sub btnBrowseSrc_Click(ByVal sender As System.Object, _
    ByVal e As System.EventArgs) Handles btnBrowseSrc.Click

    OpenFileDialog1.Title = "Open File"
    OpenFileDialog1.Filter = "Files (*.*)|*.*"

    ' Leave the text box unchanged if the user cancels the dialog.
    If OpenFileDialog1.ShowDialog() = Windows.Forms.DialogResult.Cancel Then
        Exit Sub
    End If

    Dim sFilePath As String = OpenFileDialog1.FileName

    ' Only display the path if the selected file actually exists.
    If System.IO.File.Exists(sFilePath) Then
        txtSourceFile.Text = sFilePath
    End If

End Sub

This is all fairly standard. The Open File Dialog is configured to display the title "Open File" and the filter is set to display all files. If the user selects the cancel button, the subroutine exits. When the user selects a file through the dialog, the subroutine checks whether the file exists and, if it does, sets the text property of the appropriate text box to display the path to the file.

The next block of code executes the hash-based test of the two selected files:

Private Sub btnTest_Click(ByVal sender As System.Object, _
    ByVal e As System.EventArgs) Handles btnTest.Click

    ' Create the default hash algorithm implementation.
    Dim myHash As HashAlgorithm
    myHash = HashAlgorithm.Create()

    ' Stop here if either file path is missing.
    If txtTestFile.Text = String.Empty OrElse _
       Me.txtSourceFile.Text = String.Empty Then
        MessageBox.Show("Set all form fields prior to initiating a test", _
                        "Missing Form Data", MessageBoxButtons.OK)
        Exit Sub
    End If

    ' Read the test file into a byte array and hash it.
    ' FileMode.Open avoids creating an empty file if the path does not exist.
    Dim fs1 As New FileStream(txtTestFile.Text, FileMode.Open, FileAccess.Read)
    ' VB array sizes are upper bounds, so Length - 1 gives exactly Length elements.
    Dim fs1Bytes As Byte() = New Byte(CInt(fs1.Length) - 1) {}
    fs1.Read(fs1Bytes, 0, fs1Bytes.Length)
    Dim arr1() As Byte = myHash.ComputeHash(fs1Bytes)
    fs1.Close()

    ' Read the source (original) file into a byte array and hash it.
    Dim fs2 As New FileStream(txtSourceFile.Text, FileMode.Open, FileAccess.Read)
    Dim fs2Bytes As Byte() = New Byte(CInt(fs2.Length) - 1) {}
    fs2.Read(fs2Bytes, 0, fs2Bytes.Length)
    Dim arr2() As Byte = myHash.ComputeHash(fs2Bytes)
    fs2.Close()

    ' arr1 holds the test file's hash; arr2 holds the original's.
    Dim testHash As String = BitConverter.ToString(arr1)
    Dim originalHash As String = BitConverter.ToString(arr2)

    If testHash = originalHash Then
        MessageBox.Show("The file examined has not been tampered with.", _
                        "Hash Test Passed")
    Else
        MessageBox.Show("The file examined has been tampered with.", _
                        "Hash Test Failed")
    End If

    ' Display the comparison in either case.
    MessageBox.Show("Original Hash: " & Environment.NewLine & _
                    originalHash & Environment.NewLine & _
                    "Test Hash: " & Environment.NewLine & _
                    testHash, "Hash Test Results")

End Sub

The subroutine starts by creating an instance of the Hash Algorithm class called "myHash". Next, the subroutine validates that there is text contained in each of the two text boxes used to contain the paths to the source and test files to be used in the evaluation.

The next bit of code is as follows:

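' FileMode.Open avoids creating an empty file if the path does not exist.
Dim fs1 As New FileStream(txtTestFile.Text, FileMode.Open, FileAccess.Read)
' VB array sizes are upper bounds, so Length - 1 gives exactly Length elements.
Dim fs1Bytes As Byte() = New Byte(CInt(fs1.Length) - 1) {}
fs1.Read(fs1Bytes, 0, fs1Bytes.Length)
Dim arr1() As Byte = myHash.ComputeHash(fs1Bytes)
fs1.Close()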

 

This code creates a file stream, passing it the path to the test file and the file mode. A byte array sized to the length of the file stream is created and then populated with the content of the stream. That array is passed to the hash algorithm's ComputeHash method, and the hash it returns is stored in a second byte array. Lastly, the file stream is closed. The same process is then applied to the source file in the next bit of code.


When the hash for each of the files has been generated, the subroutine uses System.BitConverter to convert the two byte arrays to strings and then compares the strings. If they are identical, the user is informed that the file has not been tampered with or changed; if they do not match, the user is informed of the mismatch. In either case, the two hashes are then displayed so that the user can confirm the result. Even a minor change to a file will result in a completely different hash.
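As an alternative to converting the hashes to strings, the two byte arrays can be compared directly. The following is a minimal sketch of that idea, not part of the sample application; BytesEqual is a hypothetical helper and the sample byte values are made up for illustration.

```vbnet
Imports System

Module HashCompareDemo
    ' Hypothetical helper: True when both arrays have the same length and contents.
    Function BytesEqual(ByVal a As Byte(), ByVal b As Byte()) As Boolean
        If a Is Nothing OrElse b Is Nothing Then Return a Is b
        If a.Length <> b.Length Then Return False

        For i As Integer = 0 To a.Length - 1
            If a(i) <> b(i) Then Return False
        Next

        Return True
    End Function

    Sub Main()
        ' Made-up values standing in for the output of ComputeHash.
        Dim h1 As Byte() = {&H1A, &H2B, &H3C}
        Dim h2 As Byte() = {&H1A, &H2B, &H3C}
        Dim h3 As Byte() = {&H1A, &H2B, &H3D}

        Console.WriteLine(BytesEqual(h1, h2)) ' prints True
        Console.WriteLine(BytesEqual(h1, h3)) ' prints False
    End Sub
End Module
```

The string form remains convenient for display, as in the message boxes above, while the direct comparison avoids building two strings just to test for equality.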


The next subroutine is used to handle the Byte Test button click event; that code is as follows:

 

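Private Sub btnByteCompare_Click(ByVal sender As System.Object, _
    ByVal e As System.EventArgs) Handles btnByteCompare.Click

    ' Read the test file into a byte array.
    Dim fs1 As New FileStream(txtTestFile.Text, FileMode.Open, FileAccess.Read)
    Dim fs1Bytes As Byte() = New Byte(CInt(fs1.Length) - 1) {}
    fs1.Read(fs1Bytes, 0, fs1Bytes.Length)
    fs1.Close()

    ' Read the source file into a byte array.
    Dim fs2 As New FileStream(txtSourceFile.Text, FileMode.Open, FileAccess.Read)
    Dim fs2Bytes As Byte() = New Byte(CInt(fs2.Length) - 1) {}
    fs2.Read(fs2Bytes, 0, fs2Bytes.Length)
    fs2.Close()

    ' Compare only the positions both files share, so the loop
    ' cannot run past the end of the shorter array.
    Dim sharedLength As Integer = Math.Min(fs1Bytes.Length, fs2Bytes.Length)
    Dim i As Integer = 0
    For i = 0 To sharedLength - 1
        If Not fs1Bytes(i) = fs2Bytes(i) Then
            MessageBox.Show("The file examined has been tampered with at position " & _
                            i.ToString(), "Byte Test Failed")
            Exit Sub
        End If
    Next

    ' Files of different lengths differ even when the shared bytes match.
    If fs1Bytes.Length <> fs2Bytes.Length Then
        MessageBox.Show("The file examined has been tampered with at position " & _
                        sharedLength.ToString(), "Byte Test Failed")
        Exit Sub
    End If

    MessageBox.Show("The file examined has not been tampered with.", _
                    "Byte Test Passed")

End Sub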

This subroutine starts out by opening a file stream for each of the two files (source and test) and converts the content of the two files to byte arrays. Once this is done, the subroutine executes a loop to do a byte-by-byte comparison between the two files. If the files match from beginning to end, the user will be told that the file has not been tampered with; if the files do not match at any position in the byte array, the user will be told at what position the first mismatch occurred.
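For large files, the same byte-by-byte idea can be applied without loading either file fully into memory. The sketch below is one possible variation under stated assumptions, not the sample application's code: FindFirstMismatch is a hypothetical helper, the paths come from command-line arguments, and a return value of -1 is an arbitrary convention for "identical".

```vbnet
Imports System
Imports System.IO

Module ByteCompareDemo
    ' Hypothetical helper: returns the zero-based position of the first
    ' differing byte, or -1 when the two files are identical.
    Function FindFirstMismatch(ByVal sourcePath As String, ByVal testPath As String) As Long
        Using s1 As New BufferedStream(File.OpenRead(sourcePath))
            Using s2 As New BufferedStream(File.OpenRead(testPath))
                Dim position As Long = 0
                Dim b1 As Integer = s1.ReadByte()
                Dim b2 As Integer = s2.ReadByte()

                ' ReadByte returns -1 at the end of a stream, so a shorter
                ' file shows up as a mismatch at its own length.
                While b1 = b2 AndAlso b1 <> -1
                    position += 1
                    b1 = s1.ReadByte()
                    b2 = s2.ReadByte()
                End While

                ' Both streams ended at the same point with no difference found.
                If b1 = b2 Then Return -1
                Return position
            End Using
        End Using
    End Function

    Sub Main(ByVal args As String())
        If args.Length <> 2 Then
            Console.WriteLine("Usage: ByteCompareDemo <sourceFile> <testFile>")
            Return
        End If

        Dim mismatch As Long = FindFirstMismatch(args(0), args(1))
        If mismatch = -1 Then
            Console.WriteLine("The files are identical.")
        Else
            Console.WriteLine("First mismatch at position " & mismatch.ToString())
        End If
    End Sub
End Module
```

Reading through a BufferedStream keeps the per-byte calls cheap, and stopping at the first difference means a tampered file is usually detected before the whole file has been read.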

Testing the Application

To prepare for the test, create a file in Notepad, type some text into it, and save it. Next, create an exact duplicate of the file. Use these two files as the source and test files in the application.

Build and launch the application and use the browse buttons to load the two files created per the last paragraph. Once the two files have been set, click on the "Hash Test" button. You should see this result displayed:

FileTamperTest4.gif

Figure 4. Hash Test Results for Identical Files

FileTamperTest5.gif

Figure 5. Original and Test Hash Comparison

Dismiss the dialog boxes by clicking OK on each of them. Now click on the Byte Test button; the results displayed should match this example:

FileTamperTest6.gif

Figure 6. Byte Test Results for Two Identical Files

Now, open the duplicate file in Notepad and edit one letter in the text. In the example, my text file contained the string shown in Figure 7. In that string, I replaced the "b" in boat with a "g" to turn boat into goat. Save the file and repeat the test.

FileTamperTest7.gif

Figure 7. Notepad with Sample Text

When the test is repeated, the results for the hash test will be as follows:

FileTamperTest8.gif

Figure 8. Hash Test Results after Edit of Test File

FileTamperTest9.gif

Figure 9. Different Hash for Original and Test Files After Edit of Test File

FileTamperTest10.gif
 
Figure 10. Byte Test Failure Pointing to Position of Mismatch

As can be seen from the results, the hash returned by the test file after making a single character change is entirely different from the original, and the mismatch is easily detected by the comparison. Similarly, when performing the byte array test, the position of failure was easily trapped by making the byte-by-byte comparison of the two files. Position 82 in this case is the zero-based position where the "b" in boat was swapped for the "g" in goat.

Summary

This example was intended to show a couple of ways in which two files may be compared to determine whether they are identical. While it shows only two approaches, several variations can be applied. For example, the HashAlgorithm class's ComputeHash method has an overload that performs the same operation directly on the file stream, without first converting it to a byte array.
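A minimal sketch of that stream-based variation follows. It is an illustration under stated assumptions rather than part of the sample project: HashFile is a hypothetical helper, SHA256 is chosen here in place of the default returned by HashAlgorithm.Create(), and the file paths come from command-line arguments.

```vbnet
Imports System
Imports System.IO
Imports System.Security.Cryptography

Module StreamHashDemo
    ' Hypothetical helper: hashes a file straight from its stream and
    ' returns the result as a hyphen-separated hex string.
    Function HashFile(ByVal path As String) As String
        Using hasher As HashAlgorithm = SHA256.Create()
            Using fs As New FileStream(path, FileMode.Open, FileAccess.Read)
                ' ComputeHash(Stream) reads the stream to its end internally,
                ' so no intermediate byte array is needed.
                Return BitConverter.ToString(hasher.ComputeHash(fs))
            End Using
        End Using
    End Function

    Sub Main(ByVal args As String())
        If args.Length <> 2 Then
            Console.WriteLine("Usage: StreamHashDemo <sourceFile> <testFile>")
            Return
        End If

        Dim sourceHash As String = HashFile(args(0))
        Dim testHash As String = HashFile(args(1))

        Console.WriteLine("Original Hash: " & sourceHash)
        Console.WriteLine("Test Hash:     " & testHash)

        If sourceHash = testHash Then
            Console.WriteLine("The file examined has not been tampered with.")
        Else
            Console.WriteLine("The file examined has been tampered with.")
        End If
    End Sub
End Module
```

Because the stream is consumed in pieces, this form also avoids the array-sizing step and copes better with files too large to hold comfortably in memory.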