String Encoding/Decoding and Conversions in C#

All strings in a .NET Framework program are stored as sequences of 16-bit Unicode (UTF-16) code units. At times you might need to convert from Unicode to some other character encoding, or from some other character encoding to Unicode. The .NET Framework provides several classes for encoding (converting Unicode characters to a block of bytes in another encoding) and decoding (converting a block of bytes in another encoding to Unicode characters).
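As a minimal sketch of the difference between the in-memory form and an encoded form, the following example encodes one short string with two encodings and decodes it again. The class name EncodingBasics and its helper methods are hypothetical names chosen for illustration; only the Encoding members (GetBytes, GetString) come from the framework.

```csharp
using System;
using System.Text;

static class EncodingBasics
{
    // Encodes text with the given encoding and returns the number of bytes produced.
    public static int ByteCount(Encoding encoding, string text)
    {
        return encoding.GetBytes(text).Length;
    }

    // Encodes text to bytes and decodes it back with the same encoding.
    public static string RoundTrip(Encoding encoding, string text)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    static void Main()
    {
        // "caf" followed by U+00E9 (e with acute accent): four UTF-16 code units.
        string text = "caf\u00E9";

        Console.WriteLine(text.Length);                            // 4
        Console.WriteLine(ByteCount(Encoding.UTF8, text));         // 5
        Console.WriteLine(ByteCount(Encoding.Unicode, text));      // 8
        Console.WriteLine(RoundTrip(Encoding.UTF8, text) == text); // True
    }
}
```

The same four characters occupy five bytes in UTF-8 (the accented letter takes two) and eight bytes in UTF-16, which is why the encoding has to be known before a block of bytes can be turned back into text.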
 
The System.Text namespace has a number of Encoding implementations: 
  1. ASCIIEncoding class encodes Unicode characters as single 7-bit ASCII characters. This class supports only character values between U+0000 and U+007F.
  2. UnicodeEncoding class encodes Unicode characters as UTF-16, using two consecutive bytes for each 16-bit code unit (characters outside the Basic Multilingual Plane take two code units, that is, four bytes). This supports both little-endian (code page 1200) and big-endian (code page 1201) byte orders.
  3. UTF7Encoding class encodes Unicode characters using UTF-7 encoding (UTF-7 stands for UCS Transformation Format, 7-bit form). This supports all Unicode character values and can also be accessed as code page 65000.
  4. UTF8Encoding class encodes Unicode characters using UTF-8 encoding (UTF-8 stands for UCS Transformation Format, 8-bit form). This supports all Unicode character values and can also be accessed as code page 65001. 
Each of these classes has methods for both encoding (such as GetBytes) and decoding (such as GetChars) a single array all at once. In addition, each supports GetEncoder and GetDecoder, which return encoders and decoders capable of maintaining shift state so they can be used with streams and blocks.
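To illustrate why the state kept by a decoder matters, here is a small sketch in which a two-byte UTF-8 character is split across two chunks, as can happen when reading from a stream in fixed-size blocks. The class DecoderDemo and its two methods are hypothetical names used only for this example; GetDecoder, Decoder.GetChars, GetMaxCharCount, and GetString are the framework calls being demonstrated.

```csharp
using System;
using System.Text;

static class DecoderDemo
{
    // Decodes each chunk independently; a character split across chunks is damaged.
    public static string DecodeChunksSeparately(Encoding encoding, byte[][] chunks)
    {
        StringBuilder result = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            result.Append(encoding.GetString(chunk));
        }
        return result.ToString();
    }

    // Decodes all chunks with one Decoder, which carries incomplete
    // byte sequences over from one call to the next.
    public static string DecodeChunksWithDecoder(Encoding encoding, byte[][] chunks)
    {
        Decoder decoder = encoding.GetDecoder();
        StringBuilder result = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            char[] chars = new char[encoding.GetMaxCharCount(chunk.Length)];
            int charCount = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
            result.Append(chars, 0, charCount);
        }
        return result.ToString();
    }

    static void Main()
    {
        // U+00E9 is the two UTF-8 bytes C3 A9; they are split across the chunks.
        byte[][] chunks =
        {
            new byte[] { 0x63, 0x61, 0x66, 0xC3 }, // "caf" plus the first byte
            new byte[] { 0xA9 }                    // the second byte
        };

        Console.WriteLine(DecodeChunksWithDecoder(Encoding.UTF8, chunks) == "caf\u00E9"); // True
        Console.WriteLine(DecodeChunksSeparately(Encoding.UTF8, chunks) == "caf\u00E9");  // False
    }
}
```

StreamReader and StreamWriter do this bookkeeping internally, so an explicit Decoder is mainly useful when the bytes arrive in blocks that the program handles itself.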
 
Listing 20.33 writes a text file with a StreamWriter and reads it back with a StreamReader, passing Encoding.UTF8 to each.
 
Listing 20.33: Encoding and Decoding 
  // writing
  // FileMode.Create truncates an existing file, so no stale bytes remain.
  using (FileStream fs = new FileStream("text.txt", FileMode.Create))
  using (StreamWriter writer = new StreamWriter(fs, Encoding.UTF8))
  {
      writer.Write("This is in UTF8");
  }

  // reading
  using (FileStream fs = new FileStream("text.txt", FileMode.Open))
  using (StreamReader reader = new StreamReader(fs, Encoding.UTF8))
  {
      string s = reader.ReadLine();
  }
Listing 20.34 makes a Web page request and then decodes the bytes read from the response as ASCII characters.
 
Listing 20.34: String Encoding 
  // encoding example
  using System;
  using System.Net;
  using System.IO;
  using System.Text;

  class MyApp
  {
      static void Main()
      {
          try
          {
              WebRequest theRequest =
                  WebRequest.Create(@"http://www.c-sharpcorner.com");
              // Dispose the response and its stream when done.
              using (WebResponse theResponse = theRequest.GetResponse())
              using (Stream responseStream = theResponse.GetResponseStream())
              {
                  byte[] buffer = new byte[256]; // buffer size
                  StringBuilder strResponse = new StringBuilder();
                  int bytesRead = responseStream.Read(buffer, 0, buffer.Length);
                  while (bytesRead != 0)
                  {
                      // Encoding.ASCII covers only the lowest 128 Unicode
                      // characters, U+0000 to U+007F. Bytes above 0x7F cannot
                      // be decoded and come out as a substitute character.
                      strResponse.Append(
                          Encoding.ASCII.GetString(buffer, 0, bytesRead));
                      bytesRead = responseStream.Read(buffer, 0, buffer.Length);
                  }
                  Console.Write(strResponse.ToString());
              }
          }
          catch (Exception e)
          {
              Console.Write("Exception occurred! {0}", e.ToString());
          }
      }
  }
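Because ASCII decoding discards anything outside U+0000 to U+007F, a page containing accented or non-Latin text would not survive the loop above intact. The following sketch shows the effect without a network call: a MemoryStream stands in for the response stream, and the body is assumed to be UTF-8 (a real response may declare a different encoding). The class ResponseDecoding and the method ReadAllText are hypothetical names for this example.

```csharp
using System;
using System.IO;
using System.Text;

static class ResponseDecoding
{
    // Reads the whole stream as text using the given encoding.
    public static string ReadAllText(Stream stream, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Stand-in for a response body: "caf" plus U+00E9, encoded as UTF-8.
        byte[] body = Encoding.UTF8.GetBytes("caf\u00E9");

        string asAscii = ReadAllText(new MemoryStream(body), Encoding.ASCII);
        string asUtf8 = ReadAllText(new MemoryStream(body), Encoding.UTF8);

        Console.WriteLine(asUtf8 == "caf\u00E9");  // True
        Console.WriteLine(asAscii == "caf\u00E9"); // False: the accented letter is lost
    }
}
```

Passing the stream to a StreamReader also removes the need for the manual buffer loop, since the reader decodes across block boundaries itself.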
 
Conclusion
 
I hope this article has helped you understand string encoding, decoding, and conversions in C#.
 
Note: This article has been excerpted from the book "The Complete Visual C# Programmer's Guide" from the Authors of C# Corner.

