ImageEn for Delphi and C++ Builder

 Need to extract all plain text from a PDF file

PeterPanino

Posted - Feb 12 2025 :  11:53:39
I have tried the following:

var ThisPdfDoc := iexPdfiumCore.TPdfDocument.Create;
try
  ThisPdfDoc.LoadFromFile(APdfFile);
  for var i := 0 to ThisPdfDoc.PageCount - 1 do
  begin
    //ThisPdfDoc.Pages[i]. -> Unfortunately, there is no method to extract the text from the whole page!
  end;
finally
  ThisPdfDoc.Free;
end;

xequte

Posted - Feb 12 2025 :  18:41:35
Please see the example at:

http://www.imageen.com/help/TIEPdfViewer.SelText.html

Alternatively, you can iterate through all the text objects in the page:

http://www.imageen.com/help/TIEPdfViewer.Objects.html

Nigel
Xequte Software
www.imageen.com