Scan Barcodes From a PDF Using iTextSharp

Background

A few days ago I read an article by Sourav Kayal about a barcode library, and it gave me an idea: can that library scan barcodes in a PDF document? I downloaded it and tried. It could not scan barcodes directly from the PDF document.

Introduction


iTextSharp is a very popular free .NET PDF library that I sometimes use to process PDF documents. I tried combining iTextSharp with the barcode library mentioned in the Background section: iTextSharp extracts the images from the PDF, and the barcode library scans them. It worked, so I want to share the solution here.

What iTextSharp supports:

  • PDF generation
  • PDF manipulation (stamping watermarks, merging/splitting PDFs and so on)
  • PDF form filling
  • XML functionality
  • Digital signatures

 

Simple code showing how to use iTextSharp:

    using System.IO;
    using iTextSharp.text;
    using iTextSharp.text.pdf;

    // create a document object (A4 page; left/right margins 50, top/bottom margins 25)
    Document document = new Document(PageSize.A4, 50, 50, 25, 25);

    // create a new PdfWriter object, specifying the output stream
    FileStream output = new FileStream("firstPdf.pdf", FileMode.Create);

    var writer = PdfWriter.GetInstance(document, output);

    // open the document for writing
    document.Open();

    // create a new paragraph object with the text, "Hello, World!"
    Paragraph welcomeParagraph = new Paragraph("Hello, World!");

    // add the paragraph object to the document
    document.Add(welcomeParagraph);

    // close the document - this saves the document contents to the output stream
    document.Close();

Screenshot



Sample Code to Scan Barcode from PDF

This part presents the complete code for the job. If you don't know how to use the barcode library, please check Sourav Kayal's article. The "Hello, World!" code above should give you a feel for how iTextSharp is used.

The method GetImages uses iTextSharp to extract the images on the first page of a PDF document and save them to disk as JPEG files.

Source PDF document


The method GetImages

    // at the top of the file:
    // using System.IO; using System.Linq; using System.Drawing.Imaging; using iTextSharp.text.pdf;

    private static void GetImages(string filename)
    {
        int pageNum = 1; // only the first page is processed

        PdfReader pdf = new PdfReader(filename);
        try
        {
            // walk from the page dictionary to its XObject resources
            PdfDictionary pg = pdf.GetPageN(pageNum);
            PdfDictionary res = (PdfDictionary)PdfReader.GetPdfObject(pg.Get(PdfName.RESOURCES));
            if (res == null) { return; }
            PdfDictionary xobj = (PdfDictionary)PdfReader.GetPdfObject(res.Get(PdfName.XOBJECT));
            if (xobj == null) { return; }

            foreach (PdfName name in xobj.Keys)
            {
                PdfObject obj = xobj.Get(name);
                if (!obj.IsIndirect()) { continue; }

                // keep only XObjects whose subtype is Image
                PdfDictionary tg = (PdfDictionary)PdfReader.GetPdfObject(obj);
                PdfName type = (PdfName)PdfReader.GetPdfObject(tg.Get(PdfName.SUBTYPE));
                if (!PdfName.IMAGE.Equals(type)) { continue; }

                // read the undecoded stream bytes of the image object
                int xrefIndex = ((PRIndirectReference)obj).Number;
                PdfObject pdfObj = pdf.GetPdfObject(xrefIndex);
                PdfStream pdfStream = (PdfStream)pdfObj;
                byte[] bytes = PdfReader.GetStreamBytesRaw((PRStream)pdfStream);
                if (bytes == null) { continue; }

                using (System.IO.MemoryStream memStream = new System.IO.MemoryStream(bytes))
                using (System.Drawing.Image img = System.Drawing.Image.FromStream(memStream))
                {
                    // every image on the page is written to the same file name,
                    // so the last image found is the one left on disk
                    string path = String.Format(@"result-{0}.jpg", pageNum);
                    System.Drawing.Imaging.EncoderParameters parms = new System.Drawing.Imaging.EncoderParameters(1);
                    parms.Param[0] = new System.Drawing.Imaging.EncoderParameter(System.Drawing.Imaging.Encoder.Compression, 0);
                    var jpegEncoder = ImageCodecInfo.GetImageEncoders().ToList().Find(x => x.FormatID == ImageFormat.Jpeg.Guid);
                    img.Save(path, jpegEncoder, parms);
                }
            }
        }
        finally
        {
            pdf.Close(); // release the source file
        }
    }
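
One caveat about GetImages: PdfReader.GetStreamBytesRaw returns the image stream without decoding it, and System.Drawing.Image.FromStream can only load those bytes when they already form a complete image file. In practice that usually means a JPEG stored with the DCTDecode filter. The following is a minimal sketch, not part of the original solution, of a guard that checks for the JPEG start-of-image marker before the bytes are loaded. The class name ImageSniffer and the method name LooksLikeJpeg are hypothetical and chosen only for illustration.

```csharp
using System;

public static class ImageSniffer
{
    // Hypothetical helper: returns true when the buffer begins with the
    // JPEG start-of-image marker (bytes FF D8 FF).
    public static bool LooksLikeJpeg(byte[] bytes)
    {
        return bytes != null
            && bytes.Length >= 3
            && bytes[0] == 0xFF
            && bytes[1] == 0xD8
            && bytes[2] == 0xFF;
    }
}
```

Inside the foreach loop, a line such as `if (!ImageSniffer.LooksLikeJpeg(bytes)) { continue; }` placed before the MemoryStream is created would skip streams that are not raw JPEG data, instead of letting Image.FromStream throw on them.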

The following shows how to use the barcode library to scan the extracted image:

    using System;
    using System.IO;

    // get images from source2.pdf
    GetImages("source2.pdf");

    // scan the extracted image for a barcode
    bool imageExist = File.Exists("result-1.jpg");
    if (imageExist)
    {
        string scanningResult = Spire.Barcode.BarcodeScanner.ScanOne("result-1.jpg");
        Console.WriteLine(scanningResult);
    }

    Console.WriteLine("Done!");
    Console.ReadLine();

Result



Conclusion

Feel free to test the code on your own PDF documents. I hope this article helps you with your own programming.