Documentation ¶
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type PdfParser ¶
type PdfParser struct {
// contains filtered or unexported fields
}
PdfParser is a wrapper around the go-fitz library.
func Open ¶
Open creates a new PdfParser and opens a PDF file at the specified path.
Parameters:
- path: The path of the PDF file to open.
Returns:
- *PdfParser: A pointer to the PdfParser object if the file was opened successfully.
- error: An error object if there was an error opening the file.
func OpenReader ¶
OpenReader creates a new PdfParser from an io.Reader.
Parameters:
- r: The io.Reader from which to create the PdfParser.
Returns:
- *PdfParser: The created PdfParser.
- error: Any error that occurred during the creation of the PdfParser.
func OpenURL ¶
OpenURL creates a new PdfParser by reading the specified URL.
Parameters:
- u: the URL to open as a string.
Returns:
- pp: a pointer to a PdfParser object.
- statusCode: an integer representing the HTTP status code of the URL response.
- err: an error object, if any error occurred during the process.
func (*PdfParser) ExtractPageTexts ¶
ExtractPageTexts extracts the text from the specified pages of a PDF document. Page numbers are zero-based.
Parameters:
- pages: A variadic parameter representing the page numbers to extract the text from.
Returns:
- A string containing the text content of the specified pages.
- An error if any error occurs during the extraction process.
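Zero-based selection through a variadic parameter can be illustrated as follows. This is a sketch under stated assumptions, not the package's code: joinPages is a hypothetical stand-in that works on an in-memory slice of page texts, and both the out-of-range error and the use of a separator between selected pages are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// joinPages is a hypothetical stand-in for zero-based page selection.
// texts holds one string per page; pages lists the indices to keep, in order.
func joinPages(texts []string, sep string, pages ...int) (string, error) {
	selected := make([]string, 0, len(pages))
	for _, p := range pages {
		// Assumption: an index outside [0, len(texts)) is an error.
		if p < 0 || p >= len(texts) {
			return "", fmt.Errorf("page %d out of range [0, %d)", p, len(texts))
		}
		selected = append(selected, texts[p])
	}
	return strings.Join(selected, sep), nil
}

func main() {
	texts := []string{"first page", "second page", "third page"}

	// Index 0 is the first page and index 2 is the third.
	out, err := joinPages(texts, "\n", 0, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```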
func (*PdfParser) ExtractTexts ¶
ExtractTexts extracts the text from all pages of a PDF document.
Parameters:
- None
Returns:
- A string containing the text content of all pages, separated by the page separator (see SetPageSep).
- An error if any error occurs during the extraction process.
func (*PdfParser) SetPageSep ¶
SetPageSep sets the separator placed between page texts by the PdfParser. The default is the character "-" repeated 100 times.