336 image files

Image treatment in book:
Numbering system: page number, then image credit number; credit numbering restarts on each page
image credits set in sans serif font

Photography credits: 
Lithography? Looks like photographic reproductions (not digital scans). Why -> check with PierreH. 
Probably lithographic work done by/at printer Snoeck-Ducaju & Zoon (Ghent)

so: piece -> photography -> lithography -> scan? film (rasterization, enlargement) -> plate -> paper -> scan -> ? :-)
or: direct to plate?

Alexia: not a digital camera; scanned negatives





Would there be any films left in the bins of Snoeck?
Design: Bureau Piet Gerards, Heerlen
Is the process of binding interesting?

TODO



Scanning process

No specific treatment for images: they are not handled separately, or rather: they are treated/scanned as text

but: 

nn98-02 = img 3 on pg 98?
nn98-01 = img 2

page numbers are not zero-padded, so the files do not sort in page order:
    
nnb119-000.jpg nnb120-003.jpg nnb12-000.jpg nnb124-000.jpg nnb128-000.jpg nnb128-001.jpg nnb128-002.jpg nnb129-000.jpg nnb14-000.jpg nnb14-001.jpg nnb143-000.jpg nnb152-000.jpg nnb152-001.jpg nnb152-002.jpg nnb152-003.jpg nnb154-000.jpg
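
A minimal sketch of how the listing above could be put back in page order, and how names could be zero-padded in a later rename. The filename pattern (letter prefix, page number, hyphen, zero-based crop index) is an assumption inferred from the names in these notes, and `parse_name` and `padded_name` are hypothetical helpers, not part of the multicrop script.

```python
import re

# Assumed pattern: letter prefix ("nb", "nn", "nnb"), page number,
# a hyphen, then a crop index that seems to start at 0.
NAME_RE = re.compile(r"^([a-z]+)(\d+)-(\d+)\.jpg$")

def parse_name(filename):
    """Split 'nnb128-002.jpg' into ('nnb', 128, 2)."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    prefix, page, index = match.groups()
    return prefix, int(page), int(index)

def padded_name(filename, page_width=3):
    """Return the name with a zero-padded page number, e.g. 'nnb012-000.jpg'."""
    prefix, page, index = parse_name(filename)
    return f"{prefix}{page:0{page_width}d}-{index:03d}.jpg"

# A few names from the listing; sorting on the parsed numbers
# instead of on the raw string restores page order.
names = ["nnb119-000.jpg", "nnb120-003.jpg", "nnb12-000.jpg", "nnb14-001.jpg"]
in_page_order = sorted(names, key=parse_name)
```

If the crop index is indeed zero-based, the image number on the page would be the index plus one, which matches the guess that nn98-02 is img 3 on pg 98.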
Where are the images on page 12? Are we looking at a different edition? Freya goes up to check the 2nd copy (Delft University Library). The two physical copies are 'identical'
Page numbers seem to be more or less correct; shifted by one page now and then

indeed the numbers are all the result of the "multicrop" script.. and the references are temporarily lost..
can you give me a link to the multicrop script? where does it live?

Check multicrop on this image: http://192.168.1.222/new_babylon/img/nb72-000.jpg
it is also the gluing together of several runs of the script with different settings, because weird stuff would often happen..
and some images are just the full page copied (full spread). so there are like 3 different numberings.. very neat. i thought renaming is just one command, but the "history" of the different scripts used was fun to keep for now.


the second one for example doesn't auto-rotate the images..

it's really a mess indeed for now... and metadata would be something to look into..
we want to make the pdf searchable at some point (tesseract) so that might make the process easier

TODO


File analysis

Interesting to 'read back' the process in these actual files.


Image: nb19-003.jpg
  Format: JPEG (Joint Photographic Experts Group JFIF format)
  Class: PseudoClass
  Geometry: 969x1002+0+0
  Resolution: 72x72
  Print size: 13.4583x13.9167
  Units: PixelsPerInch
  Type: Grayscale
  Base type: Grayscale
  Endianess: Undefined
  Colorspace: Gray
  Depth: 8-bit
  Channel depth:
    gray: 8-bit
  Channel statistics:
    Gray:
      min: 33 (0.129412)
      max: 255 (1)
      mean: 177.834 (0.697388)
      standard deviation: 59.5622 (0.233577)
      kurtosis: -0.908599
      skewness: -0.808636
  Colors: 222
  Histogram:
    1: ( 33, 33, 33) #212121 gray(33,33,33)
    2: ( 35, 35, 35) #232323 gray(35,35,35)
    2: ( 36, 36, 36) #242424 gray(36,36,36)
    3: ( 37, 37, 37) #252525 gray(37,37,37)
    4: ( 38, 38, 38) #262626 gray(38,38,38)
    14: ( 39, 39, 39) #272727 gray(39,39,39)
  Colormap: 256
    0: ( 0, 0, 0) #000000 gray(0,0,0)
    1: ( 1, 1, 1) #010101 gray(1,1,1)
    2: ( 2, 2, 2) #020202 gray(2,2,2)
    3: ( 3, 3, 3) #030303 gray(3,3,3)
    4: ( 4, 4, 4) #040404 gray(4,4,4)
    5: ( 5, 5, 5) #050505 gray(5,5,5)
  Rendering intent: Undefined
  Gamma: 1
  Interlace: None
  Background color: gray(255,255,255)
  Border color: gray(223,223,223)
  Matte color: gray(189,189,189)
  Transparent color: gray(0,0,0)
  Compose: Over
  Page geometry: 969x1002+0+0
  Dispose: Undefined
  Iterations: 0
  Compression: JPEG
  Quality: 95
  Orientation: TopLeft
  Properties:
    date:create: 2014-07-08T10:43:49+02:00
    date:modify: 2014-07-08T10:43:49+02:00
    exif:ExifImageLength: 1002
    exif:ExifImageWidth: 969
    exif:ExifOffset: 90
    exif:Orientation: 1
    exif:ResolutionUnit: 2
    exif:XResolution: 72/1
    exif:YResolution: 72/1
    jpeg:colorspace: 1
    jpeg:sampling-factor: 1x1
    signature: d500cec9dd4d1f1adbb0bc1e5ab12486acc6ac4b1f01b56420ad8006d7588a20
  Profiles:
    Profile-exif: 126 bytes
  Artifacts:
    filename: nb19-003.jpg
    verbose: true
  Tainted: False
  Filesize: 569KB
  Number pixels: 971K
  Pixels per second: 16.18MB
  User time: 0.060u
  Elapsed time: 0:01.059
  Version: ImageMagick 6.7.7-10 2013-09-10 Q16 http://www.imagemagick.org
http://en.wikipedia.org/wiki/Kurtosis
"any measure of the 'peakedness' of the probability distribution of a real-valued random variable. In a similar way to the concept of skewness, kurtosis is a descriptor of the shape of a probability distribution"
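
To make the skewness and kurtosis figures in the report above a bit less mysterious, here is a rough sketch of how such numbers can be computed from a list of gray values. It uses the plain population formulas and subtracts 3 from the kurtosis (so-called excess kurtosis); that ImageMagick normalises in exactly this way is an assumption, and the pixel values are made up for illustration.

```python
import math

def shape_statistics(values):
    """Return (mean, standard deviation, skewness, excess kurtosis)."""
    n = len(values)
    mean = sum(values) / n
    # Population variance: divide by n, not n - 1.
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    # Third and fourth standardised moments.
    skewness = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    # Subtracting 3 makes a normal distribution score 0.
    kurtosis = sum((v - mean) ** 4 for v in values) / (n * std ** 4) - 3.0
    return mean, std, skewness, kurtosis

# Hypothetical gray values: mostly light paper, a few dark pixels.
pixels = [33, 120, 180, 200, 230, 255, 255, 255]
mean, std, skewness, kurtosis = shape_statistics(pixels)
```

A negative skewness, as in the report for nb19-003.jpg, means the values pile up on the light side with a tail towards the dark: plausible for a scanned page that is mostly paper white.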


wow.

Maria: There's a randomness with the images.
No need to reconstruct the book (i.e. the order relation between text and images)

Constant: starting a painting from the edges. to know about the container

Range, dimensions, formats

$ tesseract nnb98-002.jpg -l eng -psm 1 outfile
Tesseract Open Source OCR Engine v3.02.01 with Leptonica
Too few characters. Skipping this page
OSD: Weak margin (0.00) for 0 blob text block, but using orientation anyway: 0
Test blob assigned to row at (-817.5,-67.5) on pass 0
Test blob y=(-885,0), row=(-1072.500000,-322.500000), overlap=562.500000
Test blob assigned to row at (-1072.5,-322.5) on pass 4
Test blob y=(-885,0), row=(-1072.500000,-322.500000), overlap=562.500000
Test blob assigned to row at (-1072.5,-322.5) on pass 1
create gifs:

Ideas