Commit cce891c5 authored by Carsten Rose

WIP #F10715 - qfqpdf

parent 187eb378
Pipeline #6588 failed with stages
in 1 minute and 40 seconds
......@@ -42,7 +42,7 @@ The following features are only tested / supported on linux hosts:
* General: QFQ is coded to run on Linux hosts, preferably on Debian derivatives like Ubuntu.
* PHP Module: curl
* HTML to PDF conversion - command `wkhtmltopdf`.
* HTML to PDF conversion - command `wkhtmltopdf` and `qfqpdf`.
* Concatenation of PDF files - command `pdfunite`.
* Conversion of images to PDF files - command `img2pdf`.
* PDF decrypt (used for merge with pdfunite) - command `qpdf`.
......@@ -75,8 +75,23 @@ Preparation for Ubuntu::
.. _wkhtml:
wkhtmltopdf
^^^^^^^^^^^
HTML to PDF: qfqpdf
^^^^^^^^^^^^^^^^^^^
`wkhtml` (described below) is increasingly outdated and no longer under development. As a replacement, QFQ has started to use
puppeteer (https://developers.google.com/web/tools/puppeteer/). The tool can't be used directly; a wrapper is necessary,
which is available at https://git.math.uzh.ch/bbaer/qfqpdf. The wrapper always installs and uses the latest version of
puppeteer. On first start and during updates, rendering a PDF might take longer.
Installation::
curl -L -o qfqpdf https://www.math.uzh.ch/repo/qfqpdf/current/qfqpdf-linux
chmod a+x qfqpdf
./qfqpdf --version
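The installation steps above can be followed by a quick sanity check before QFQ is configured to use the binary. The following is a minimal sketch and not part of QFQ or qfqpdf: the function name `qfqpdf_ready` is hypothetical, and it only checks that the downloaded file exists and is executable.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of QFQ): report whether a qfqpdf binary is usable.
qfqpdf_ready() {
    local bin="$1"
    if [ ! -f "$bin" ]; then
        echo "missing: $bin"
        return 1
    fi
    if [ ! -x "$bin" ]; then
        echo "not executable: $bin (run: chmod a+x $bin)"
        return 2
    fi
    echo "ok: $bin"
}
```

A possible use is `qfqpdf_ready ./qfqpdf && ./qfqpdf --version`, so the version call is attempted only once the file is in place.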
HTML to PDF: wkhtmltopdf
^^^^^^^^^^^^^^^^^^^^^^^^
`wkhtmltopdf <http://wkhtmltopdf.org/>`_ will be used by QFQ to offer 'website print' and 'HTML to PDF' conversion.
The program is not included in QFQ and has to be manually installed.
......@@ -143,11 +158,6 @@ Checklist wkhtml problems
* Check the `--cookie-jar '/tmp/qfq.cookie....'` file for the cookie.
* Call wkhtml manually on the webserver, with the same options as given in the :ref:`QFQ_LOG`.
HTML to PDF conversion
""""""""""""""""""""""
`wkhtmltopdf` converts a website (local or remote) to a (multi)-page PDF file. It's mainly used in :ref:`download`.
Print
"""""
......
......@@ -768,7 +768,8 @@ Column: _link
+---+---+--------------+-----------------------------------+---------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| | |SIP |s[:0|1] |s, s:0, s:1 |If 's' or 's:1' a SIP entry is generated with all non Typo 3 Parameters. The URL contains only parameter 's' and Typo 3 parameter |
+---+---+--------------+-----------------------------------+---------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| | |Mode |M:file|pdf|zip |M:file, M:pdf, M:zip |Mode. Used to specify type of download. One or more element sources needs to be configured. See :ref:`download`. |
| | |Mode |M:file|pdf|qfqpdf|zip |M:file, M:pdf, M:qfqpdf, |Mode. Used to specify type of download. One or more element sources needs to be configured. See :ref:`download`. |
| | | | |M:zip | |
+---+---+--------------+-----------------------------------+---------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| | |File |F:<filename> |F:fileadmin/file.pdf |Element source for download mode file|pdf|zip. See :ref:`download`. |
+---+---+--------------+-----------------------------------+---------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
......@@ -2130,11 +2131,11 @@ By using the ``_link`` column name:
* in ``persistent link`` mode: path download script (optional) and key(s) to identify the record with the PathFilename
information (see below).
* the optional ``M:...`` (Mode) specifies the export type (file, pdf, zip, export),
* the optional ``M:...`` (Mode) specifies the export type (file, pdf, qfqpdf, zip, export),
* the alttext ``a:...`` specifies a message in the download popup.
By using ``_pdf``, ``_Pdf``, ``_file``, ``_File``, ``_zip``, ``_Zip``, ``_excel`` as column name, the options `d`, `M` and `s`
will be set.
By using ``_pdf``, ``_Pdf``, ``_file``, ``_File``, ``_zip``, ``_Zip``, ``_excel`` as column name, the options `d`,
`M` (pdf: wkhtml) and `s` will be set.
All files will be read by PHP - therefore the directory might be protected against direct web access. This is the
preferred option to offer secure downloads via QFQ. Check `secure-direct-file-access`_.
......@@ -2201,7 +2202,10 @@ Parameter and (element) sources
* *mode*: ``M:<mode>``
* *mode* = <file | pdf | zip | excel>
* *mode* = <file | pdf | qfqpdf | zip | excel>
* pdf: `wkhtml` will be used to render the pdf.
* qfqpdf: `qfqpdf` will be used to render the pdf.
* If ``M:file``, the mime type is derived dynamically from the specified file. In this mode, only one element source
is allowed per download link (no concatenation).
......@@ -2214,7 +2218,7 @@ Parameter and (element) sources
* If only one `file` is specified, the default is `file`.
* If there is a) a page defined or b) multiple elements, the default is `pdf`.
* *element sources* - for ``M:pdf`` or ``M:zip``, all of the following element sources may be specified multiple times.
* *element sources* - for ``M:pdf``, ``M:qfqpdf`` or ``M:zip``, all of the following element sources may be specified multiple times.
Any combination and order of these options are allowed.
* *file*: ``F:<pathFileName>`` - relative or absolute pathFileName offered for a) download (single), or to be concatenated
......@@ -2259,7 +2263,7 @@ Parameter and (element) sources
* Tip: For a clearer structure, put the additional tt-content record on the same Typo3 page (where the QFQ
tt-content record is located which produces the link) and specify ``render = api`` (`report-render`_).
* *WKHTML Options* for `page`, `urlParam` or `url`:
* 'M:pdf' - *WKHTML Options* for `page`, `urlParam` or `url`:
* The 'HTML to PDF' will be done via `wkhtmltopdf`.
* All possible options, suitable for `wkhtmltopdf`, can be submitted in the `p:...`, `u:...` or `U:...` element source.
......@@ -2268,6 +2272,17 @@ Parameter and (element) sources
the key/value tuple in `p:...`, `u:...` or `U:...` has to be separated by '='. Please see last example below.
* If an option contains an '&' it must be escaped with double \\ . See example.
* 'M:qfqpdf' - *qfqpdf Options* for `page`, `urlParam` or `url`:
* The 'HTML to PDF' will be done via `qfqpdf`.
* Check https://puppeteer.github.io/puppeteer and https://git.math.uzh.ch/bbaer/qfqpdf/-/tree/master
* All possible options, suitable for `qfqpdf`, can be submitted in the `p:...`, `u:...` or `U:...` element source.
Be aware that key/value tuples in the documentation are separated by a space, but to respect the QFQ key/value
notation of URLs, the key/value tuple in `p:...`, `u:...` or `U:...` has to be separated by '='. Please see last example below.
* If an option contains an '&' it must be escaped with double \\ . See example.
* Page numbering is done via HTML templating / CSS classes: ``--header-template '<div style="font-size:5mm;" class="pageNumber"></div>'``
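The '=' notation described above can be illustrated with a small sketch. It splits an element source on '&' and '=' and treats keys that start with '-' as renderer options, which is how `OnString` appears to separate them in this commit. The function name `split_qfq_params` is hypothetical, the option names are taken from the qfqpdf example call in `Html2Pdf`, and escaped ampersands are deliberately not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of QFQ): split a QFQ element source such as
#   id=detail&r=1&--page-size=A5&--margin-top=22mm
# into page parameters and command-line options.
# Assumption: keys starting with '-' are renderer options; escaped '&' is NOT handled.
split_qfq_params() {
    local input="$1" pair key value
    local -a pairs
    IFS='&' read -r -a pairs <<< "$input"
    for pair in "${pairs[@]}"; do
        key="${pair%%=*}"
        if [ "$key" = "$pair" ]; then
            value=""                 # flag without '=value'
        else
            value="${pair#*=}"
        fi
        case "$key" in
            -*)
                # QFQ notation 'key=value' becomes 'key value' on the command line
                if [ -n "$value" ]; then
                    printf 'option %s %s\n' "$key" "$value"
                else
                    printf 'option %s\n' "$key"
                fi
                ;;
            *)
                printf 'param %s=%s\n' "$key" "$value"
                ;;
        esac
    done
}
```

Calling `split_qfq_params 'id=detail&r=1&--page-size=A5'` prints two `param` lines followed by `option --page-size A5`, which mirrors the space-separated form used in the tool's own documentation.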
Most of the other Link-Class attributes can be used to customize the link as well.
Example `_link`: ::
......@@ -2281,6 +2296,9 @@ Example `_link`: ::
# three sources: two pages and one file
SELECT "d:complete.pdf|s|t:Complete PDF|p:id=detail&r=1|p:id=detail2&r=1|F:fileadmin/pdf/test.pdf" AS _link
# qfqpdf - three sources: two pages and one file
SELECT "d:complete.pdf|M:qfqpdf|s|t:Complete PDF|p:id=detail&r=1|p:id=detail2&r=1|F:fileadmin/pdf/test.pdf" AS _link
# three sources: two pages and one file
SELECT "d:complete.pdf|s|t:Complete PDF|p:id=detail&r=1|p:id=detail2&r=1|F:fileadmin/pdf/test.pdf" AS _link
......@@ -2296,8 +2314,8 @@ Example `_link`: ::
# One indirect source reference
SELECT "d:complete.pdf|s|t:Complete PDF|source:centralPdf&pId=1234" AS _link
An additional tt-content record is defined with `sub header: centralPdf`. One or multiple attachments might be concatenated.
10.sql = SELECT '|F:', a.pathFileName FROM Attachments AS a WHERE a.pId={{pId:S}}
An additional tt-content record is defined with `sub header: centralPdf`. One or multiple attachments might be concatenated.
10.sql = SELECT '|F:', a.pathFileName FROM Attachments AS a WHERE a.pId={{pId:S}}
..
......
......@@ -140,7 +140,7 @@ class OnString {
$urlParam = KeyValueStringParser::parse($urlParamString, '=', '&', KVP_IF_VALUE_EMPTY_COPY_KEY);
foreach ($urlParam as $key => $value) {
switch (substr($key, 0, 1)) {
switch ($key[0] ?? '') {
case '-':
$rcArgs[$key] = $value;
break;
......
......@@ -23,7 +23,7 @@ class SessionCookie {
/**
* @var array
*/
private $arrCookieString = array();
private $arrQfqPdfCookie = array();
/**
* @var bool
......@@ -38,7 +38,7 @@ class SessionCookie {
* @throws \CodeException
*/
public function __construct(array $config) {
$data = array();
$wkhtml = array();
// In debug mode, keep temporary files
$this->cleanTempFiles = !Support::findInSet(SYSTEM_SHOW_DEBUG_INFO_DOWNLOAD, $config[SYSTEM_SHOW_DEBUG_INFO]);
......@@ -58,12 +58,23 @@ class SessionCookie {
// $data .= $name . "=" . $value . "; domain=$domain; path=$path;\n";
// }
// "expires": 1633622615.629729,
// "size": 16,
// "httpOnly": true,
// "secure": true,
// "session": false,
// "sameSite": "Lax"
// Prepare cookie for wkhtml & qfqpdf
foreach ($_COOKIE as $key => $value) {
$data[] = ['name' => $key, 'value' => $value, 'url' => $domain, 'path' => $path];
$this->arrCookieString[] = "name:$key,value:$value,url:$domain,path:$path";
$wkhtml[] = ['name' => $key, 'value' => $value, 'url' => $domain, 'path' => $path];
// $this->arrCookieString[] = "name:$key,value:$value,url:$domain,path:$path";
// qfqpdf seems to have problems if 'domain' is specified: it hangs while fetching the website. Skip domain.
$this->arrQfqPdfCookie[] = "name:$key,value:$value";
}
file_put_contents($this->pathFileNameCookie, json_encode(['cookies' => $data]), FILE_APPEND);
file_put_contents($this->pathFileNameCookie, json_encode(['cookies' => $wkhtml]), FILE_APPEND);
}
/**
......@@ -95,6 +106,6 @@ class SessionCookie {
* @return string
*/
public function getCookieQfqPdf() {
return '"' . implode('" "', $this->arrCookieString) . '"';
return '"' . implode('" "', $this->arrQfqPdfCookie) . '"';
}
}
\ No newline at end of file
......@@ -182,13 +182,15 @@ class Html2Pdf {
$cmd = $this->config[SYSTEM_CMD_WKHTMLTOPDF] . " $customHeader $cookieOptions $options $urlPrint $filenameEscape";
break;
case DOWNLOAD_MODE_QFQPDF:
// /opt/qfqpdf/qfqpdf "https://www.math.uzh.ch" fileadmin/outputNew.pdf --page-size A5 --margin-top 22mm --template-header "<h1>Test</h1>" -Fb -c "name:firstCookie,value:tunjngvnjfr23r" "name:secondCookie,value:249ur4jn233d"'
// /opt/qfqpdf/qfqpdf "https://www.math.uzh.ch" fileadmin/outputNew.pdf
// --page-size A5 --margin-top 22mm
// --template-header "<h1>Test</h1>"
// -Fb
// -c "name:firstCookie,value:tunjngvnjfr23r" "name:secondCookie,value:249ur4jn233d"
$customHeader = '';
$options .= ' -b';
$cookieOptions = '--cookies ' . escapeshellarg($this->sessionCookie->getCookieQfqPdf());
$options = '';
$cookieOptions = '';
// $cookieOptions = '--cookies ' . escapeshellarg($this->sessionCookie->getCookieQfqPdf());
$cookieOptions = '-c ' . $this->sessionCookie->getCookieQfqPdf();
$cmd = $this->config[SYSTEM_CMD_QFQPDF] . " $urlPrint $filenameEscape $customHeader $cookieOptions $options";
break;
......
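Review note: `getCookieQfqPdf()` wraps each cookie in double quotes only. If the resulting command string is passed to a shell, a cookie value containing `"`, `$` or a backtick could be interpreted rather than passed through. The following is a minimal sketch of quoting each cookie as its own argument, assuming a bash caller. The function name `quote_cookies` is hypothetical; in PHP the analogous step would presumably be `escapeshellarg()` applied per cookie rather than to the joined string.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of QFQ): emit '-c' followed by each cookie,
# individually shell-quoted so that special characters stay literal.
quote_cookies() {
    local cookie out=""
    for cookie in "$@"; do
        # printf '%q' is a bash builtin feature that quotes a string for reuse as shell input
        out+=" $(printf '%q' "$cookie")"
    done
    printf '%s\n' "-c${out}"
}
```

Quoting per cookie keeps every `name:...,value:...` pair a separate argument, which quoting the whole joined string would not.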