Generating SHA1 and MD5 hashes in .NET (C#), Java and Ruby

Posted by Adrian Fri, 26 Jun 2009 21:04:00 GMT

Hashes are a vital technique for validating data that has been passed between two systems. Quite often, the simplest implementation of a hash in one language won't necessarily produce the same hash-string as another language. This article gives you code snippets for three languages that produce identical hashes for any given data.

SHA-1 and MD5 are hashing algorithms. They read some content and produce a fixed-length value (160 bits for SHA-1, 128 bits for MD5), usually written as a hexadecimal string, that is effectively unique to that content.

The purpose of a hash is to identify a file as being consistent with your expectations. The two reasons I use a hash are: to ensure that a file I transmit (usually over a web service) is exactly the same as the file on the server, and also to create a ‘base-line’ hash of a downloaded file to compare against the same file at a later date (if the user has made changes the hash will change).
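The 'base-line' comparison described above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class name BaselineCheck and the helpers sha1Hex and hasChanged are hypothetical names, not part of any library, and the file is read into memory in one go for brevity.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class BaselineCheck {

    // Hypothetical helper: SHA-1 of a file's bytes as a lowercase hex string.
    public static String sha1Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] digest = md.digest(Files.readAllBytes(file));
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b & 0xff)); // two hex digits per byte
        }
        return sb.toString();
    }

    // Hypothetical helper: true when the file no longer matches the stored base-line hash.
    public static boolean hasChanged(Path file, String baselineHex)
            throws IOException, NoSuchAlgorithmException {
        return !sha1Hex(file).equalsIgnoreCase(baselineHex);
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("baseline", ".txt");
        Files.write(file, "original".getBytes(StandardCharsets.UTF_8));

        String baseline = sha1Hex(file);                  // store this at download time
        System.out.println(hasChanged(file, baseline));   // false

        Files.write(file, "edited".getBytes(StandardCharsets.UTF_8));
        System.out.println(hasChanged(file, baseline));   // true

        Files.delete(file);
    }
}
```

The comparison ignores case so that a base-line stored in uppercase hex still matches.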

Note that these hashes, while fairly robust, are not bulletproof. Two files can have the same hash despite having completely different content. These are called collisions, and are extremely rare (at least when considering collisions that happen by chance). Be aware that malicious users can purposefully create matching hashes on different files (this is practical for MD5 and has been demonstrated for SHA-1), so this is not intended as a security mechanism. If you need to prove that a document has not changed, consider digital signatures with an adequately long key (2048 bits or more for RSA).

The primary purpose of this article is to give programmers a quick reference implementation for the SHA-1 hash that will read a binary file on the local disk and return a hexadecimal string. I have provided implementations for Ruby, Java and .NET that all return identical values for a given file.

If you have written your own functions and are having problems matching your hashes, ensure that you are opening your files in binary mode (applies to Ruby).

Here’s the code:

.NET (C#):

// Requires: using System.IO; using System.Security.Cryptography; using System.Text;
public static string GenerateHash(string filePathAndName)
{
  byte[] fileData = File.ReadAllBytes(filePathAndName);

  byte[] hashData;
  using (HashAlgorithm algorithm = SHA1.Create()) // SHA1 or MD5
  {
    hashData = algorithm.ComputeHash(fileData);
  }

  StringBuilder hashText = new StringBuilder(hashData.Length * 2);
  foreach (byte b in hashData)
  {
    // "x2" gives two lowercase hex digits per byte, zero-padded (lowercase for compatibility on case-sensitive systems)
    hashText.Append(b.ToString("x2"));
  }

  return hashText.ToString();
}

Ruby/Rails:

require 'digest' # Provides Digest::SHA1 and Digest::MD5

def generate_hash(file_path_and_name)

  hash_func = Digest::SHA1.new # SHA1 or MD5

  # Open in binary mode ("rb") so the bytes are read exactly as stored on disk
  File.open(file_path_and_name, "rb") do |io|
    while (!io.eof)
      read_buf = io.readpartial(1024)
      hash_func.update(read_buf)
    end
  end

  hash_func.hexdigest

end

Java:

// Requires: import java.io.*; import java.security.*;
public static String generateHash(File file) throws NoSuchAlgorithmException, FileNotFoundException, IOException
{
  MessageDigest md = MessageDigest.getInstance("SHA"); // SHA (i.e. SHA-1) or MD5
  StringBuilder hash = new StringBuilder();

  // Read in chunks: a single read() call is not guaranteed to fill the buffer
  byte[] buffer = new byte[1024];
  FileInputStream fis = new FileInputStream(file);
  try
  {
    int bytesRead;
    while ((bytesRead = fis.read(buffer)) != -1)
    {
      md.update(buffer, 0, bytesRead);
    }
  }
  finally
  {
    fis.close();
  }

  byte[] digest = md.digest();

  for (int i = 0; i < digest.length; i++)
  {
    String hex = Integer.toHexString(digest[i] & 0xff); // Mask to 0-255 so negative bytes give at most two digits
    if (hex.length() == 1) hex = "0" + hex;
    hash.append(hex);
  }

  return hash.toString();
}

Note that all examples use SHA-1 by default, but can easily be changed to use MD5 by changing one line in each case (see comments). Note also that these implementations take a file, but could very easily be adapted to take a byte array, if that's what you're working with.
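As a rough sketch of the byte-array adaptation mentioned above, the Java version might look like the following. The class name ByteArrayHash and the extra algorithm parameter are assumptions made for illustration; they are not part of the original snippets. The expected values in the comments are the standard published SHA-1 and MD5 digests for the ASCII string "abc".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ByteArrayHash {

    // Hypothetical variant of generateHash: hashes an in-memory byte array
    // with the named algorithm ("SHA-1" or "MD5") and returns lowercase hex.
    public static String generateHash(byte[] data, String algorithm)
            throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        byte[] digest = md.digest(data);
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b & 0xff)); // two hex digits per byte
        }
        return hex.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] data = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(generateHash(data, "SHA-1")); // a9993e364706816aba3e25717850c26c9cd0d89d
        System.out.println(generateHash(data, "MD5"));   // 900150983cd24fb0d6963f7d28e17f72
    }
}
```

Checking each language's output against a known value like these is a quick way to confirm that the implementations agree before pointing them at real files.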

