How Becker & Poliakoff Utilized contentCrawler to Centralize OCR Processing

Tue 28 Jun 2022


contentCrawler from Litera replaced a costly, cumbersome four-step OCR workflow at Becker & Poliakoff with an efficient single-step process. OCR processing – which used to be expensive and time-consuming for the firm – is now accomplished within a centralized framework, saving time and allowing users to stay in control of their workspace.


Staff at Becker & Poliakoff were using a four-step process to convert paper documents to text-searchable PDFs. They would scan the documents into their system, open the documents in pdfDocs (a PDF creation and editing application from Litera), OCR the documents by adding a text layer, and profile the new text-searchable PDF into their DMS.

Though this workflow helped meet regulatory requirements around storing files as PDFs, the whole process was extremely time-consuming and error-prone. If staff missed a step or skipped the process entirely, documents could easily be profiled into the DMS as non-searchable files.


Becker & Poliakoff IT staff knew they needed an automated end-to-end system that could handle the conversion of documents to text-searchable PDFs more efficiently. They reached out to Litera to deploy contentCrawler.

contentCrawler is an integrated analysis, processing and reporting tool that assesses image documents in content repositories and applies OCR conversion. Its automated end-to-end process converts image documents to text-searchable PDFs, which are saved back into the content repository, ready to be found by searching. This approach reduces non-compliance risks and increases efficiency.

Becker & Poliakoff decided to run contentCrawler in Active Monitoring and Backlog modes simultaneously. Active Monitoring performed OCR processing on all newly profiled documents in the DMS. Backlog processing, on the other hand, converted all the legacy documents in the system to text-searchable PDFs. The backlog took longer because there were hundreds of thousands of documents to be assessed and converted to PDF. contentCrawler can work through 17,000 pages per day on a 4-core machine, and firms can add cores to speed up the process.
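
To get a rough feel for how long a backlog run might take, the quoted throughput can be turned into a back-of-the-envelope estimate. The sketch below is illustrative only: the function name is hypothetical, and it assumes throughput scales linearly with core count, which the article does not state (it says only that adding cores speeds up the process).

```python
import math

# Throughput figure quoted in the article: 17,000 pages per day on 4 cores.
PAGES_PER_DAY_BASELINE = 17_000
BASELINE_CORES = 4


def estimate_backlog_days(total_pages: int, cores: int = BASELINE_CORES) -> int:
    """Estimate whole days needed to OCR a backlog of `total_pages`.

    ASSUMPTION: throughput scales linearly with the number of cores.
    Real-world scaling may be lower; treat the result as a rough guide.
    """
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    if cores <= 0:
        raise ValueError("cores must be positive")
    pages_per_day = PAGES_PER_DAY_BASELINE * cores / BASELINE_CORES
    # Round up: a partially used day still counts as a day of processing.
    return math.ceil(total_pages / pages_per_day)


if __name__ == "__main__":
    # Hypothetical backlog size, chosen only to illustrate the arithmetic.
    backlog_pages = 1_700_000
    for cores in (4, 8):
        days = estimate_backlog_days(backlog_pages, cores)
        print(f"{cores} cores: about {days} days")
```

Under these assumptions, a hypothetical backlog of 1,700,000 pages would take about 100 days on a 4-core machine and about 50 days on 8 cores.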

About Litera

Litera is the leading provider of software for law firms and document-intensive organizations across the globe, helping them satisfy client demands. Our document drafting products empower users to create, proofread, compare, clean, and distribute high-quality content quickly and securely, from any device, while our transaction management platform transforms the manual, tedious process of managing transactions by creating a secure, collaborative workspace and automating the entire signature process.


Having non-searchable documents in its DMS was an unacceptable and unnecessary risk for Becker & Poliakoff. To resolve the issue, the firm deployed contentCrawler – an automated, end-to-end OCR framework from Litera. Now, all documents saved into their DMS are fully text searchable and compliant with industry requirements.


