Crop PDF pages along the spine direction

GL_n · Posted on 2018-12-15 22:58:03

This post was last edited by GL_n on 2018-12-16 02:24

Sometimes we need to split two-up PDF pages along the spine (i.e., the middle line) into two separate pages: for example, two A4 book pages scanned onto one A3 sheet, or two A5 book pages laid out on one A4 sheet. There are tools for cutting such double pages, but the cropping is not automatic, and repeating the same operation page after page is tedious.

How can the repeated cropping be done automatically? A Python program is a good choice for this task. The following code is one working example: it crops two-up PDF pages whatever the page format is.

#coding=utf-8
# Written against the legacy PyPDF2 API (PdfFileReader/PdfFileWriter), which PyPDF2 3.0 removed.
from PyPDF2 import PdfFileWriter, PdfFileReader
from copy import copy
from os import listdir
import math

OUTPUT_SUFFIX = '-cut_myself_revised.pdf'

def op(pdfInputFileName):
    pdfWriter = PdfFileWriter()
    pdfOutputFileName = pdfInputFileName[:-4] + OUTPUT_SUFFIX

    # Keep the input open until the output is written: PyPDF2 reads page data lazily.
    with open(pdfInputFileName, 'rb') as pdfFileObj:
        pdfReader = PdfFileReader(pdfFileObj)

        for page in [pdfReader.getPage(i) for i in range(pdfReader.getNumPages())]:
            p = page
            q = copy(p)                      # second half starts as a copy of the page
            q.mediaBox = copy(p.mediaBox)    # give q its own media box

            x_1, x_2 = p.mediaBox.lowerLeft
            x_3, x_4 = p.mediaBox.upperRight

            x_1, x_2 = math.floor(x_1), math.floor(x_2)
            x_3, x_4 = math.floor(x_3), math.floor(x_4)
            # Midlines, assuming the media box starts at (0, 0) as in typical scans.
            x_5, x_6 = math.floor(x_3/2), math.floor(x_4/2)

            if x_3 < x_4:  # If the scanned page already displays upright in Adobe Acrobat, this "if" block can be deleted.
                p = p.rotateClockwise(90)
                q = q.rotateClockwise(90)

            if x_3 > x_4:  # For editable page
                # Cut with a vertical line at x = x_5 (splitting the X-axis range); the halves overlap by 5%.
                p.mediaBox.lowerLeft = (x_1, x_2)  # Left part of two-page-rectangle
                p.mediaBox.upperRight = (x_5 * 105/100, x_4)

                q.mediaBox.lowerLeft = (x_5 * 95/100, x_2)
                q.mediaBox.upperRight = (x_3, x_4)  # Right part of two-page-rectangle
            else:  # For image page
                # Cut at y = x_6 (splitting the Y-axis range); on the rotated page this shows as the vertical midline.
                p.mediaBox.lowerLeft = (x_1, x_2)
                p.mediaBox.upperRight = (x_3, x_6 * 105/100)  # Left part of two-page-rectangle

                q.mediaBox.lowerLeft = (x_1, x_6 * 95/100)
                q.mediaBox.upperRight = (x_3, x_4)  # Right part of two-page-rectangle

            pdfWriter.addPage(p)
            pdfWriter.addPage(q)

        with open(pdfOutputFileName, 'wb') as pdfOutputFile:
            pdfWriter.write(pdfOutputFile)

# Crop every PDF (both editable pages and image pages) in the current directory automatically.
for pdfInputFileName in listdir('.'):
    # Match .pdf/.PDF, and skip files this script produced on an earlier run.
    if pdfInputFileName[-4:].lower() == '.pdf' and not pdfInputFileName.endswith(OUTPUT_SUFFIX):
        op(pdfInputFileName)
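
For readers who want to check the midline arithmetic without PyPDF2, here is a minimal sketch of the same idea as a pure function. The names `Box`, `split_at_spine`, and the `overlap` parameter are hypothetical and not part of the script above. Unlike the script, this sketch measures from the box's own lower-left corner instead of assuming it sits at (0, 0), and it does no rounding.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned rectangle in PDF user-space units (points)."""
    left: float
    bottom: float
    right: float
    top: float


def split_at_spine(box: Box, overlap: float = 0.05) -> Tuple[Box, Box]:
    """Split a two-up box into two halves that overlap slightly at the spine.

    The cut runs across the longer side. `overlap` is the fraction of a
    half-page by which each half extends past the midline (an assumed
    parameter; the script above hard-codes 5%).
    """
    width = box.right - box.left
    height = box.top - box.bottom
    if width > height:
        # Landscape box: cut with a vertical line at the horizontal midpoint.
        half = width / 2
        mid = box.left + half
        first = Box(box.left, box.bottom, mid + overlap * half, box.top)
        second = Box(mid - overlap * half, box.bottom, box.right, box.top)
    else:
        # Portrait (or square) box: cut at the vertical midpoint instead.
        half = height / 2
        mid = box.bottom + half
        first = Box(box.left, box.bottom, box.right, mid + overlap * half)
        second = Box(box.left, mid - overlap * half, box.right, box.top)
    return first, second


if __name__ == "__main__":
    # A3 landscape is roughly 1191 x 842 points.
    first_half, second_half = split_at_spine(Box(0, 0, 1191, 842))
    print(first_half)
    print(second_half)
```

The two returned boxes could then be assigned to the media boxes of the page and its copy, as the script does with `p` and `q`. The small overlap keeps text near the spine from being clipped when the scan is slightly off-center.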


greatABC005 · Posted on 2020-2-3 15:05:30

Thanks for your great work.
