Visual guide of C# Extract Text from Image

Optical Character Recognition (OCR) technology bridges the physical and digital worlds by converting text within images into machine-readable data. For .NET developers, the ability to extract text from images in C# is essential for building intelligent document processing, automated data entry, and accessibility solutions.

In this article, we’ll explore how to implement OCR in C# using the Spire.OCR for .NET library, covering basic extraction, advanced features like coordinate tracking, and best practices to ensure accuracy and efficiency.

Understanding OCR and Spire.OCR

What is OCR?

OCR technology analyzes images of text - such as scanned documents, screenshots, or photos - and converts them into text strings that can be edited, searched, or processed programmatically.

Why Spire.OCR Stands Out

Spire.OCR for .NET is a powerful, developer-friendly library that enables highly accurate text recognition from images in C# applications. Key features include:

  • Support for multiple languages (English, Chinese, Japanese, etc.).
  • High accuracy recognition algorithms optimized for various fonts and styles.
  • Text coordinate extraction for precise positioning.
  • Batch processing capabilities.
  • Compatibility with .NET Framework and .NET Core.

Setting Up Your OCR Environment

Before writing any C# OCR code, set up your development environment:

1. Install via NuGet:

Open the NuGet Package Manager in Visual Studio. Search for "Spire.OCR" and install the latest version in your project. Alternatively, use the Package Manager Console:

Install-Package Spire.OCR

2. Download OCR Models:

Spire.OCR relies on pre-trained models to recognize text in images. Download the model files that match your operating system, then extract them to a directory (e.g., F:\OCR Model\win-x64).

Important Note: Remember to change the platform target of your solution to x64 as Spire.OCR only supports 64-bit platforms.

Set x64 platform target for OCR project
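Because the library runs only on 64-bit platforms, a small runtime guard can surface a wrong platform target before any OCR call is made. The following is a minimal sketch that uses only standard .NET APIs; the PlatformGuard class and its messages are illustrative names, not part of Spire.OCR.

```csharp
using System;

namespace OcrPlatformCheck
{
    // Hypothetical helper (not a Spire.OCR type) that checks process bitness.
    public static class PlatformGuard
    {
        // Returns a short description of the current process bitness.
        public static string DescribeProcess()
        {
            return Environment.Is64BitProcess ? "64-bit process" : "32-bit process";
        }

        // Throws when the process is not 64-bit, so setup fails fast with a clear message.
        public static void Ensure64Bit()
        {
            if (!Environment.Is64BitProcess)
            {
                throw new PlatformNotSupportedException(
                    "This application must run as a 64-bit process. Set the platform target to x64.");
            }
        }
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine(PlatformGuard.DescribeProcess());
            PlatformGuard.Ensure64Bit();
        }
    }
}
```

Calling Ensure64Bit() at the start of Main keeps the failure close to its cause instead of surfacing later as a native library load error.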


Basic Recognition: Extract Text from Images in C#

Let’s start with a simple example that demonstrates how to read text from an image using Spire.OCR.

C# code to get text from an image:

using Spire.OCR;
using System.IO;

namespace OCRTextFromImage
{
    internal class Program
    {
        static void Main(string[] args)
        {

            // Create an instance of the OcrScanner class
            OcrScanner scanner = new OcrScanner();

            // Create an instance of the ConfigureOptions class
            ConfigureOptions configureOptions = new ConfigureOptions();

            // Set the path to the OCR model
            configureOptions.ModelPath = "F:\\OCR Model\\win-x64";

            // Set the language for text recognition. (The default is English.)
            configureOptions.Language = "English";

            // Apply the configuration options to the scanner
            scanner.ConfigureDependencies(configureOptions);

            // Scan image and extract text
            scanner.Scan("sample.png");

            // Save the extracted text to a txt file
            string text = scanner.Text.ToString();
            File.WriteAllText("output.txt", text);
        }
    }
}

Code Explanation:

  • OcrScanner: Core class for text recognition.
  • ConfigureOptions: Sets OCR parameters:
    • ModelPath: Specifies the path to the OCR model files.
    • Language: Defines the recognition language (e.g., "English", "Chinese").
  • Scan(): Processes image and extracts text using the configured settings.

Output:

This C# code processes an image file (sample.png) and saves the extracted text to a text file (output.txt) using File.WriteAllText().

Extract or read text from a PNG image.
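The single-image example extends naturally to a folder of images. The sketch below shows one possible batch loop in which the recognition step is passed in as a delegate, so the loop itself does not depend on any OCR type. BatchRunner and the stand-in recognizer are hypothetical; in a real project the delegate would presumably wrap scanner.Scan(path) followed by scanner.Text.ToString(), as in the example above. Whether one OcrScanner instance can safely be reused across many scans is an assumption to verify against the library documentation.

```csharp
using System;
using System.Collections.Generic;

namespace OcrBatchSketch
{
    // Hypothetical helper (not a Spire.OCR type) that applies a recognizer to many files.
    public static class BatchRunner
    {
        // Runs the supplied recognition function over each image path.
        // A failure on one file is recorded and does not stop the batch.
        public static Dictionary<string, string> Run(
            IEnumerable<string> imagePaths, Func<string, string> recognize)
        {
            var results = new Dictionary<string, string>();
            foreach (string path in imagePaths)
            {
                try
                {
                    results[path] = recognize(path);
                }
                catch (Exception ex)
                {
                    results[path] = "ERROR: " + ex.Message;
                }
            }
            return results;
        }
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            // Stand-in recognizer for demonstration; a real one would call the OCR scanner.
            Func<string, string> recognize = path => "text from " + path;

            Dictionary<string, string> results =
                BatchRunner.Run(new[] { "a.png", "b.png" }, recognize);

            foreach (KeyValuePair<string, string> entry in results)
            {
                Console.WriteLine(entry.Key + " -> " + entry.Value);
            }
        }
    }
}
```

Recording errors per file rather than rethrowing means that one unreadable image does not discard the results already collected.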

Advanced Extraction: Extract Text with Coordinates in C#

In many cases, knowing the position of extracted text within an image is as important as the text itself - for example, when processing invoices, forms, or structured documents. Spire.OCR allows you to extract not just text but also the coordinates of the text blocks, enabling precise analysis.

C# code to extract text with coordinates from an Image:

using Spire.OCR;
using System.Collections.Generic;
using System.IO;

namespace OCRWithCoordinates
{
    internal class Program
    {
        static void Main(string[] args)
        {

            // Create an instance of the OcrScanner class
            OcrScanner scanner = new OcrScanner();

            // Create an instance of the ConfigureOptions class
            ConfigureOptions configureOptions = new ConfigureOptions();

            // Set the path to the OCR model
            configureOptions.ModelPath = "F:\\OCR Model\\win-x64";

            // Set the language for text recognition. (The default is English.)
            configureOptions.Language = "English";

            // Apply the configuration options to the scanner
            scanner.ConfigureDependencies(configureOptions);

            // Extract text from an image
            scanner.Scan("invoice.png");

            // Get the OCR result text
            IOCRText text = scanner.Text;

            // Create a list to store information
            List<string> results = new List<string>();

            // Iterate through each block of the OCR result text
            foreach (IOCRTextBlock block in text.Blocks)
            {
                // Add the text of each block and its location information to the list
                results.Add($"Block Text: {block.Text}");
                results.Add($"Coordinates: {block.Box}");
                results.Add("---------");
            }

            // Save the extracted text with coordinates to a txt file
            File.WriteAllLines("ExtractWithCoordinates.txt", results);
        }
    }
}

Critical Details

  • IOCRText: Represents the entire OCR result.
  • IOCRTextBlock: Represents a block of contiguous text (e.g., a paragraph, line, or word).
  • IOCRTextBlock.Box: Contains the rectangular coordinates of the text block:
    • X (horizontal position)
    • Y (vertical position)
    • Width
    • Height
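Once each block has a position, the coordinates can be used to rebuild reading order, for example to pair a label with the value printed beside it on an invoice. The sketch below is one simple way to do that. The Block class is a stand-in defined here, not a Spire.OCR type; it assumes you have copied the X, Y, Width and Height values listed above out of each block's Box. The pixel tolerance is also an illustrative choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace OcrLayoutSketch
{
    // Stand-in for a recognized block; values are assumed to be copied from the OCR result.
    public class Block
    {
        public string Text;
        public int X, Y, Width, Height;

        public Block(string text, int x, int y, int width, int height)
        {
            Text = text; X = x; Y = y; Width = width; Height = height;
        }
    }

    public static class Layout
    {
        // Groups blocks into rows: a block joins the current row when its top edge is
        // within `tolerance` pixels of the row's first block. Rows are read left to right.
        public static List<string> ToLines(IEnumerable<Block> blocks, int tolerance)
        {
            var lines = new List<string>();
            var row = new List<Block>();
            foreach (Block current in blocks.OrderBy(item => item.Y).ThenBy(item => item.X))
            {
                if (row.Count > 0 && Math.Abs(current.Y - row[0].Y) > tolerance)
                {
                    lines.Add(JoinRow(row));
                    row.Clear();
                }
                row.Add(current);
            }
            if (row.Count > 0)
            {
                lines.Add(JoinRow(row));
            }
            return lines;
        }

        private static string JoinRow(List<Block> row)
        {
            return string.Join(" ", row.OrderBy(item => item.X).Select(item => item.Text));
        }
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            var blocks = new List<Block>
            {
                new Block("Total:", 10, 100, 50, 12),
                new Block("$42.00", 200, 102, 60, 12),
                new Block("Invoice", 10, 10, 80, 20)
            };

            foreach (string line in Layout.ToLines(blocks, 5))
            {
                Console.WriteLine(line);
            }
        }
    }
}
```

With the sample data, "Total:" and "$42.00" differ by only 2 pixels vertically and end up on the same line, while "Invoice" forms its own line above them.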

Output:

This C# code performs OCR on an image file (invoice.png), extracting both the recognized text and its position coordinates in the image, then saves this information to a text file (ExtractWithCoordinates.txt).

Get text with coordinate information from a PNG image.


Tips to Optimize OCR Accuracy

To ensure reliable results when using C# to recognize text from images, consider these best practices:

  • Use high-resolution images (300 DPI or higher).
  • Preprocess images (e.g., resize, deskew) for better results.
  • Ensure the language setting matches the language of the text in the image.
  • Store OCR models in a secure, accessible location.
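As a small illustration of the resolution tip, the sketch below computes how much a low-resolution image would need to be enlarged to reach a target DPI. It only does the arithmetic; the actual resizing would be done with an imaging library of your choice. DpiHelper is a hypothetical name, and whether upscaling improves recognition for a given image is something to test rather than assume.

```csharp
using System;

namespace OcrDpiSketch
{
    // Hypothetical helper (not a Spire.OCR type) for resize arithmetic.
    public static class DpiHelper
    {
        // Returns the enlargement factor needed to reach targetDpi.
        // Images already at or above the target are left unchanged (factor 1.0).
        public static double ScaleFactor(double currentDpi, double targetDpi)
        {
            if (currentDpi <= 0) throw new ArgumentOutOfRangeException("currentDpi");
            if (targetDpi <= 0) throw new ArgumentOutOfRangeException("targetDpi");
            return currentDpi >= targetDpi ? 1.0 : targetDpi / currentDpi;
        }

        // Applies the factor to a pixel dimension, rounding to the nearest whole pixel.
        public static int ScaledSize(int pixels, double factor)
        {
            return (int)Math.Round(pixels * factor);
        }
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            // A 96 DPI screenshot that is 800 pixels wide, scaled toward 300 DPI.
            double factor = DpiHelper.ScaleFactor(96, 300);
            int newWidth = DpiHelper.ScaledSize(800, factor);
            Console.WriteLine("New width in pixels: " + newWidth);
        }
    }
}
```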

FAQs (Supported Languages and Image Formats)

Q1: What image formats does Spire.OCR support?

A: Spire.OCR supports common image formats, including:

  • PNG
  • JPEG/JPG
  • BMP
  • TIFF
  • GIF
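When scanning a folder, it can help to skip files that are not in one of these formats before handing them to the scanner. The following sketch filters by file extension using only standard .NET APIs. ImageFilter is a hypothetical helper, and the extension spellings (for example both ".tif" and ".tiff") are assumptions derived from the format names above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

namespace OcrFormatSketch
{
    // Hypothetical helper (not a Spire.OCR type) that filters files by extension.
    public static class ImageFilter
    {
        // Extensions assumed to correspond to the formats listed in this FAQ.
        private static readonly HashSet<string> Extensions =
            new HashSet<string>(StringComparer.OrdinalIgnoreCase)
            {
                ".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff", ".gif"
            };

        // Returns true when the file extension matches one of the listed formats,
        // ignoring case. The file content itself is not inspected.
        public static bool IsSupported(string path)
        {
            return Extensions.Contains(Path.GetExtension(path));
        }
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            string[] candidates = { "invoice.png", "notes.txt", "scan.TIFF" };
            foreach (string candidate in candidates)
            {
                Console.WriteLine(candidate + ": " + ImageFilter.IsSupported(candidate));
            }
        }
    }
}
```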

Q2: What languages does Spire.OCR support?

A: Multiple languages are supported:

  • English (default)
  • Chinese (Simplified and Traditional)
  • Japanese
  • Korean
  • German
  • French
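Since the language is passed as a plain string, a typo may go unnoticed until recognition quality drops. The sketch below validates a user-supplied name before it is assigned to ConfigureOptions.Language. It assumes that the accepted values are the spellings given in the comment of the new-model example later in this document (English, Chinese, Chinesetraditional, French, German, Japanese, Korean); OcrLanguages is a hypothetical helper.

```csharp
using System;

namespace OcrLanguageSketch
{
    // Hypothetical helper (not a Spire.OCR type) that validates language names.
    public static class OcrLanguages
    {
        // Spellings assumed from the comment in the new-model code example.
        private static readonly string[] Known =
        {
            "English", "Chinese", "Chinesetraditional",
            "French", "German", "Japanese", "Korean"
        };

        // Returns the canonical spelling for a language name, ignoring case and
        // surrounding whitespace, or null when the name is not in the list.
        public static string Normalize(string name)
        {
            if (string.IsNullOrWhiteSpace(name))
            {
                return null;
            }
            string trimmed = name.Trim();
            foreach (string known in Known)
            {
                if (string.Equals(known, trimmed, StringComparison.OrdinalIgnoreCase))
                {
                    return known;
                }
            }
            return null;
        }
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            string requested = args.Length > 0 ? args[0] : "english";
            string language = OcrLanguages.Normalize(requested);
            Console.WriteLine(language ?? "Unknown language: " + requested);
        }
    }
}
```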

Q3: Can I use Spire.OCR in ASP.NET Core applications?

A: Yes. Supported environments:

  • .NET Framework 2.0+
  • .NET Standard 2.0+
  • .NET Core 2.0+
  • .NET 5

Q4: Can Spire.OCR extract text from scanned PDFs in C#?

A: Not directly. Scanned PDFs must first be converted to images, or have their embedded images extracted, using Spire.PDF. You can then apply the C# examples above to get text from those images.


Conclusion & Free License

Spire.OCR for .NET provides a straightforward way to extract text from images in C# applications. Whether you’re building a simple image-to-text converter or a system that processes thousands of invoices, the techniques and best practices in this guide can help you integrate OCR into your application.

Request a 30-day trial license here to get unlimited OCR capabilities and unlock valuable information trapped in visual format.

Published in Recognize Text

Spire.OCR for .NET offers developers a new model to extract text from images. In this article, we will demonstrate how to extract text from images in C# using the new model of Spire.OCR for .NET.

The detailed steps are as follows.

Step 1: Create a Console App (.NET Framework) in Visual Studio.

C#: Extract Text from Images using the New Model of Spire.OCR for .NET

Step 2: Change the platform target of the application to x64.

In the application's Solution Explorer, right-click on the project's name and then select "Properties".

Change the platform target of the application to x64. This step must be performed since Spire.OCR only supports 64-bit platforms.

Step 3: Install Spire.OCR for .NET in your application.

Install Spire.OCR for .NET through NuGet by executing the following command in the NuGet Package Manager Console:

Install-Package Spire.OCR

Step 4: Download the new model of Spire.OCR for .NET.

Download the model that matches your operating system from one of the following links.

Windows x64

Linux x64

macOS 10.15 and later

Linux aarch

Then extract the package and save it to a specific directory on your computer. In this example, we saved the package to "D:\".
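A wrong or empty model directory is an easy mistake to make at this step, so it can be worth checking the path before configuring the scanner. The sketch below uses only System.IO; ModelPathCheck is a hypothetical helper, and it checks only that the directory exists and contains files, not that the files are valid models.

```csharp
using System;
using System.IO;
using System.Linq;

namespace OcrModelCheck
{
    // Hypothetical helper (not a Spire.OCR type) that sanity-checks a model folder.
    public static class ModelPathCheck
    {
        // Returns null when the directory exists and contains at least one file
        // (searching subdirectories too); otherwise returns a description of the problem.
        public static string Validate(string modelPath)
        {
            if (string.IsNullOrWhiteSpace(modelPath))
            {
                return "Model path is empty.";
            }
            if (!Directory.Exists(modelPath))
            {
                return "Model directory not found: " + modelPath;
            }
            if (!Directory.EnumerateFiles(modelPath, "*", SearchOption.AllDirectories).Any())
            {
                return "Model directory contains no files: " + modelPath;
            }
            return null;
        }
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            // Path taken from the example in this article; adjust to your own location.
            string problem = ModelPathCheck.Validate("D:\\win-x64");
            Console.WriteLine(problem ?? "Model directory looks usable.");
        }
    }
}
```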

Step 5: Use the new model of Spire.OCR for .NET to extract text from images in C#.

The following code example shows how to extract text from an image using C# and the new model of Spire.OCR for .NET:

using Spire.OCR;
using System.IO;

namespace NewOCRModel
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Set license key
            // Spire.OCR.License.LicenseProvider.SetLicenseKey("your-license-key");

            // Create an instance of the OcrScanner class
            OcrScanner scanner = new OcrScanner();

            // Create an instance of the ConfigureOptions class to set up the scanner configuration
            ConfigureOptions configureOptions = new ConfigureOptions();

            // Set the path to the new model
            configureOptions.ModelPath = "D:\\win-x64";

            // Set the language for text recognition. The default is English.
            // Supported languages include English, Chinese, Chinesetraditional, French, German, Japanese, and Korean.
            configureOptions.Language = "English";

            // Apply the configuration options to the scanner
            scanner.ConfigureDependencies(configureOptions);

            // Extract text from an image
            scanner.Scan("test.png");

            //Save the extracted text to a text file
            string text = scanner.Text.ToString();
            File.WriteAllText("Output.txt", text);
        }
    }
}

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

Spire.OCR for .NET is a professional OCR library that supports recognizing text from Images (such as JPG, PNG, GIF, BMP, and TIFF) in both .NET Framework and .NET Core applications. In this article, we will explain how to use Spire.OCR for .NET to read text from images in .NET Framework applications.

Step 1: Create a console application (.NET Framework) in Visual Studio.

How to use Spire.OCR in .NET Framework Applications

Step 2: Change the platform target of the application to x64.

In the application's Solution Explorer, right-click on the project name and then click "Properties".

Change the platform target of the application to x64. This step must be performed since Spire.OCR only supports 64-bit platforms.

Step 3: Add a reference to Spire.OCR for .NET DLL in the application.

We recommend installing Spire.OCR for .NET through NuGet (Note: only Spire.OCR for .NET Version 1.8.0 or above supports working with .NET Framework). The detailed steps are as follows:

  • In the application's solution explorer, right-click on the solution name or "References" and select "Manage NuGet Packages".
  • Click the "Browse" tab and search for Spire.OCR.
  • Click "Install" to install Spire.OCR.

Step 4: Copy DLLs from the "packages" directory to the "Debug" directory in the application.

When you install Spire.OCR through NuGet, NuGet downloads the packages and puts them in your application under a directory called "packages". You need to find the "Spire.OCR" directory under the "packages" directory, then copy the DLLs under the "Spire.OCR" directory (packages\Spire.OCR.1.8.0\runtimes\win-x64\native) to the "Debug" directory of your application.
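The manual copy described above can also be scripted, which helps when the output directory is cleaned or the package is updated. The following is a minimal sketch using only System.IO; NativeFileCopier is a hypothetical helper, and the source and target directories are supplied by you (for example the packages\Spire.OCR.1.8.0\runtimes\win-x64\native folder and your "Debug" directory).

```csharp
using System;
using System.IO;

namespace OcrNativeCopy
{
    // Hypothetical helper (not a Spire.OCR type) that copies DLLs between folders.
    public static class NativeFileCopier
    {
        // Copies every *.dll file directly inside sourceDir into targetDir,
        // overwriting existing copies. Returns the number of files copied.
        public static int CopyDlls(string sourceDir, string targetDir)
        {
            if (!Directory.Exists(sourceDir))
            {
                throw new DirectoryNotFoundException(sourceDir);
            }
            Directory.CreateDirectory(targetDir);

            int copied = 0;
            foreach (string file in Directory.GetFiles(sourceDir, "*.dll"))
            {
                string destination = Path.Combine(targetDir, Path.GetFileName(file));
                File.Copy(file, destination, true);
                copied++;
            }
            return copied;
        }
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            if (args.Length != 2)
            {
                Console.WriteLine("Usage: <sourceDir> <targetDir>");
                return;
            }
            int count = NativeFileCopier.CopyDlls(args[0], args[1]);
            Console.WriteLine("Copied " + count + " file(s).");
        }
    }
}
```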

Now you have successfully included Spire.OCR in your .NET Framework application. You can refer to the following code example to read text from images using Spire.OCR.

using Spire.OCR;
using System.IO;

namespace OcrImage
{
    internal class Program
    {
        static void Main(string[] args)
        {
            //Create an instance of the OcrScanner class
            OcrScanner scanner = new OcrScanner();

            //Call the OcrScanner.Scan() method to scan text from an image
            scanner.Scan("image.png");

            //Save the scanned text to a .txt file
            string text = scanner.Text.ToString();            
            File.WriteAllText("output.txt", text);
        }
    }
}

Apply for a Temporary License

If you'd like to remove the evaluation message from the generated documents, or to get rid of the function limitations, please request a 30-day trial license for yourself.

This article demonstrates the steps to use Spire.OCR for .NET in .NET Core applications.

Step 1: Create a .NET Core project in Visual Studio.

How to use Spire.OCR for .NET in .NET Core Applications

Step 2: Add a reference to the Spire.OCR for .NET DLLs in your project.

You can add the reference in one of the following two ways:

1. Install Spire.OCR for .NET through NuGet using NuGet Package Manager (recommended):

  • In Solution Explorer, right-click the project or "Dependencies" and select "Manage NuGet Packages".
  • Click the "Browse" tab and search for Spire.OCR.
  • Install Spire.OCR.

2. Manually add a reference to the Spire.OCR for .NET DLLs:

  • Download the Spire.OCR for .NET package from the following link and unzip it. The DLLs are in the "netstandard2.0" folder.
  • Right-click the project or "Dependencies", select "Add Reference", click "Browse", select all DLLs under the "netstandard2.0" folder, and click "Add".
  • Install two additional packages, SkiaSharp and System.Text.Encoding.CodePages, via the NuGet Package Manager: right-click the project or "Dependencies", select "Manage NuGet Packages", click "Browse", type the package name, select the package from the search results, and click "Install".

Note: If you cannot find these packages in the NuGet Package Manager, check that "nuget.org" is set as the "Package source".

Step 3: Copy the dependency DLLs to the running directory of your project.

If you installed Spire.OCR from NuGet and your project targets .NET Core 3.0 or above, build the project, then copy the 6 DLLs from the bin\Debug\netcoreapp3.0\runtimes\win-x64\native folder to the running directory, such as bin\Debug\netcoreapp3.0 or C:\Windows\System32.

If your project targets a version below .NET Core 3.0, or you downloaded Spire.OCR from our website, copy the 6 DLLs from the Spire.OCR\Spire.OCR_Dependency\x64 folder to the running directory, such as bin\Debug\netcoreapp2.1 or C:\Windows\System32.

Step 4: Now you have successfully included Spire.OCR in your project. You can refer to the following code example to scan images using Spire.OCR.

using Spire.OCR;
using System.IO;

namespace SpireOCR
{
    class Program
    {
        static void Main(string[] args)
        {
            OcrScanner scanner = new OcrScanner();            
            scanner.Scan("image.png");
            File.WriteAllText("output.txt", scanner.Text.ToString());
        }
    }
}