  1. #101
    Veteran DarrinS's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Jun 2005
    Post Count
    42,561
    Maybe dumb question, but why not straight up copy the original and create a PDF that contains the scanned image. The copier at our office does this with a few clicks and emails you the PDF attachment. Wtf is with all the Illustrator work? Just sayin.

  2. #102
    🏆🏆🏆🏆🏆 ElNono's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Apr 2007
    Post Count
    153,473
    Maybe dumb question, but why not straight up copy the original and create a PDF that contains the scanned image. The copier at our office does this with a few clicks and emails you the PDF attachment.
    That's what they did. They just happen to use a fairly high-end scanner that layers the image so it's easier to OCR.

    Wtf is with all the Illustrator work? Just sayin.
    There's no Illustrator work. The only guy using Illustrator is the guy on the video. It's a PDF document, not an Illustrator document.

  3. #103
    Veteran DarrinS's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Jun 2005
    Post Count
    42,561
    That's what they did. They just happen to use a fairly high-end scanner that layers the image so it's easier to OCR.



    There's no Illustrator work. The only guy using Illustrator is the guy on the video. It's a PDF document, not an Illustrator document.


    That's exactly my point. Why try to OCR the doc?

  4. #104
    🏆🏆🏆🏆🏆 ElNono's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Apr 2007
    Post Count
    153,473
    The online copy you can download from the WH was exported using the 'Preview' app on Mac OS X (version 10.6.7). I would venture the original was probably a multi-layer TIFF, and Preview was used to convert it to PDF.

  5. #105
    I don't really care... Yonivore's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Oct 2001
    Post Count
    26,781
    That's exactly my point. Why try to OCR the doc?
    I think he's suggesting it's a default for the scanner they used.

  6. #106
    Veteran DarrinS's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Jun 2005
    Post Count
    42,561
    I think he's suggesting it's a default for the scanner they used.
    Fair enough.

  7. #107
    I don't really care... Yonivore's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Oct 2001
    Post Count
    26,781
    ElNono, thanks for taking the time to explain. Sounds plausible.

    Now, would it be fair to say the original from which the scan was made wouldn't have the artifacts described by the first video such as the Signature with no grayscale pixels?

  8. #108
    🏆🏆🏆🏆🏆 ElNono's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Apr 2007
    Post Count
    153,473
    That's exactly my point. Why try to OCR the doc?
    They didn't. There's no OCR text in it. If OCR was done, the PDF would contain the text. What's likely is that they use a background removal function of the scanner (some Fujitsu models have that) so they can place the scan over a generic background.

    The reason you would want to do that is because the background might obscure some of the text.

  9. #109
    I don't really care... Yonivore's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Oct 2001
    Post Count
    26,781
    And, are you saying the process took the date and changed some of the characters into black and white with no gray pixels while leaving one number with the grayscale pixels? I think it was the 19 and the 1 that are straight black and white with the 6 being grayscale. I don't recall without going back and re-watching the video.

  10. #110
    🏆🏆🏆🏆🏆 ElNono's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Apr 2007
    Post Count
    153,473
    The variations on the actual text don't really matter. The signature could be explained by many things: type of pen used, whether it's a single pen trace versus the pen going over the line multiple times, etc. The machine text looks like a typewriter, which has the same problem.

    What's going to tell you if the document is doctored is whether the white outline that surrounds the text (and it's on the background layer) matches the text above it. If you notice a pattern in it (signs of the clone tool) or a discontinuity in the gradient, then that would raise an alarm. If the guy that made the video had spotted something like that, it would've been worth looking at.
    The scale stuff is really amateur stuff. Most docs are scanned at 150 or 300 dpi. PDF uses 72dpi as native res. The math isn't that hard.
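
    To spell out that math: a minimal sketch, assuming the scan is placed on the PDF page at its true physical size and that the page uses the default PDF unit of 1/72 inch. The helper name `pdf_scale_percent` is made up for this example.

```python
# PDF user space defaults to 72 units per inch.
PDF_UNITS_PER_INCH = 72


def pdf_scale_percent(scan_dpi):
    """Percent scale reported for an image scanned at scan_dpi when it is
    placed at its real physical size (hypothetical helper, for illustration)."""
    return 100.0 * PDF_UNITS_PER_INCH / scan_dpi


# The two resolutions mentioned in the post.
for dpi in (150, 300):
    print(f"{dpi} dpi scan -> {pdf_scale_percent(dpi):.0f}% scale")
```

    So an odd-looking scale such as 24% or 48% is just what you would expect from a 300 or 150 dpi scan.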

  11. #111
    I don't really care... Yonivore's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Oct 2001
    Post Count
    26,781
    The variations on the actual text don't really matter. The signature could be explained by many things: type of pen used, whether it's a single pen trace versus the pen going over the line multiple times, etc. The machine text looks like a typewriter, which has the same problem.

    What's going to tell you if the document is doctored is whether the white outline that surrounds the text (and it's on the background layer) matches the text above it. If you notice a pattern in it (signs of the clone tool) or a discontinuity in the gradient, then that would raise an alarm. If the guy that made the video had spotted something like that, it would've been worth looking at.
    The scale stuff is really amateur stuff. Most docs are scanned at 150 or 300 dpi. PDF uses 72dpi as native res. The math isn't that hard.
    I don't know much about .pdf's but I've messed with images on a microscale for a couple of decades. All scans of documents produce letters or handwriting where the pixels gray out at the edges. The only time I've ever seen a completely black pixeled piece of text or handwriting is when the image was converted to black and white.

    My question was, does the OCR do that to some but not other ink artifacts on scanned documents?

  12. #112
    Cogito Ergo Sum LnGrrrR's Avatar
    My Team
    Boston Celtics
    Join Date
    Oct 2008
    Post Count
    22,399
    I don't know much about .pdf's but I've messed with images on a microscale for a couple of decades.
    What do you do, if you don't mind me asking?

  13. #113
    🏆🏆🏆🏆🏆 ElNono's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Apr 2007
    Post Count
    153,473
    I don't know much about .pdf's but I've messed with images on a microscale for a couple of decades. All scans of documents produce letters or handwriting where the pixels gray out at the edges. The only time I've ever seen a completely black pixeled piece of text or handwriting is when the image was converted to black and white.

    My question was, does the OCR do that to some but not other ink artifacts on scanned documents?
    The first reason for that is that you're looking at this on a 72 dpi screen. You're basically compressing pixels to fit the screen, so you're missing basically 3 pixels for each one you see on screen.

    With that in mind, the white outline you see in the background IS that edge grayscale. What happens when you do background removal is you compress the gray colorspace from 256 levels to, say, 200. The 56 'whitest' levels (white to light gray) are converted into white with a scaled alpha value. That way the gray contour 'fades' into 'transparent' (or the new background), instead of fading to white. Obviously, you also need to re-scale the remaining opaque image from 200 to 256 levels, which will remove some of the shading and 'wash' it a bit.

    On top of that, the scanner could be set to sharpen the image (don't know that it is in this case). That would enhance the contours, sharpening the text.
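
    The background removal step described above can be sketched per pixel like this. It's only an illustration of the idea, not what any particular scanner actually runs: the function name `remove_background` is made up, and the cutoff of 200 is just the "say, 200" figure from the explanation.

```python
def remove_background(gray, cutoff=200):
    """Map an 8-bit gray value (0 = black, 255 = white) to (gray, alpha).

    cutoff=200 is an assumed example value, not a known scanner setting.
    """
    if gray >= cutoff:
        # The lightest 56 levels (200..255) all become white. Alpha falls
        # from nearly opaque at the cutoff to fully transparent at pure
        # white, so the contour fades into whatever background is below.
        alpha = round(255 * (255 - gray) / (256 - cutoff))
        return 255, alpha
    # The remaining 200 levels are stretched back over 0..255 and stay
    # fully opaque, which washes out some of the original shading.
    return round(gray * 255 / (cutoff - 1)), 255


# A few sample values, from solid ink to bare paper.
for g in (0, 100, 199, 200, 228, 255):
    print(g, "->", remove_background(g))
```

    Anything darker than the cutoff comes out fully opaque, and the light halo around each stroke ends up as a white, partly transparent outline.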

  14. #114
    I don't really care... Yonivore's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Oct 2001
    Post Count
    26,781
    The first reason for that is that you're looking at this on a 72 dpi screen. You're basically compressing pixels to fit the screen, so you're missing basically 3 pixels for each one you see on screen.

    With that in mind, the white outline you see in the background IS that edge grayscale. What happens when you do background removal is you compress the gray colorspace from 256 levels to, say, 200. The 56 'whitest' levels (white to light gray) are converted into white with a scaled alpha value. That way the gray contour 'fades' into 'transparent' (or the new background), instead of fading to white. Obviously, you also need to re-scale the remaining opaque image from 200 to 256 levels, which will remove some of the shading and 'wash' it a bit.

    On top of that, the scanner could be set to sharpen the image (don't know that it is in this case). That would enhance the contours, sharpening the text.
    But, on some and not others?

  15. #115
    I don't really care... Yonivore's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Oct 2001
    Post Count
    26,781
    What do you do, if you don't mind me asking?
    The graphics work has been a hobby since computer imaging came around. I've done some logo designs and graphics work for publication.

    None of which is related to my paying job. But, in that capacity, I've manipulated images on a pixel-by-pixel level a number of times.

  16. #116
    🏆🏆🏆🏆🏆 ElNono's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Apr 2007
    Post Count
    153,473
    But, on some and not others?
    What do you mean on some and not others? Isn't there a white outline shading to the background around the text?

  17. #117
    I don't really care... Yonivore's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Oct 2001
    Post Count
    26,781
    What do you mean on some and not others? Isn't there a white outline shading to the background around the text?
    When he talks about the doctor's signature. The first letter of the name clearly has grayscale pixels that surround the letter. The rest of the signature is black.

  18. #118
    🏆🏆🏆🏆🏆 ElNono's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Apr 2007
    Post Count
    153,473
    When he talks about the doctor's signature. The first letter of the name clearly has grayscale pixels that surround the letter. The rest of the signature is black.
    Not enough ink flowing from the pen when he starts writing?

    Does the whole signature have a matching white outline around it?

  19. #119
    I don't really care... Yonivore's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Oct 2001
    Post Count
    26,781
    Not enough ink flowing from the pen when he starts writing?

    Does the whole signature have a matching white outline around it?
    It's not an issue with how the ink was applied to the paper. The first letter is an actual scan of the letter. The rest of the signature is a black and white (transparent) image of the rest of the signature.

    Forget the background for a minute because, if you assembled several originals on a transparency and then scanned them with the green background, it would create the white space you describe. That doesn't change the fact that the first letter of the signature is a full color (or at the very least a grayscale) image of the first letter and the remainder is a black and white image.

    The same phenomenon occurs on the date at the bottom and in a couple of other places.

  20. #120
    I don't really care... Yonivore's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Oct 2001
    Post Count
    26,781

    This is what I'm talking about.

  21. #121
    🏆🏆🏆🏆🏆 ElNono's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Apr 2007
    Post Count
    153,473
    It's not an issue with how the ink was applied to the paper. The first letter is an actual scan of the letter. The rest of the signature is a black and white (transparent) image of the rest of the signature.

    Forget the background for a minute because, if you assembled several originals on a transparency and then scanned them with the green background, it would create the white space you describe. That doesn't change the fact that the first letter of the signature is a full color (or at the very least a grayscale) image of the first letter and the remainder is a black and white image.

    The same phenomenon occurs on the date at the bottom and in a couple of other places.
    I can't forget about the background because the background isn't created with transparencies, it's removed digitally by reducing the colorspace resolution, and actually using the image data to do it. You effectively lose levels of gray by doing that. The actual intensity or porosity of the ink trace will have an impact on the grayscale levels you see on the scan.

    There's also some form of enhancement done on the text, probably to make OCR easier. It isn't just the signature that is entirely opaque. If you look at the actual text of the form, it's also entirely opaque. However, even that text has the properly shaped surrounding white outline.

    In order to doctor any of the black text, you would also need to doctor the surrounding white outline, or it won't match.

  22. #122
    I don't really care... Yonivore's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Oct 2001
    Post Count
    26,781
    I can't forget about the background because the background isn't created with transparencies, it's removed digitally by reducing the colorspace resolution, and actually using the image data to do it. You effectively lose levels of gray by doing that. The actual intensity or porosity of the ink trace will have an impact on the grayscale levels you see on the scan.

    There's also some form of enhancement done on the text, probably to make OCR easier. It isn't just the signature that is entirely opaque. If you look at the actual text of the form, it's also entirely opaque. However, even that text has the properly shaped surrounding white outline.

    In order to doctor any of the black text, you would also need to doctor the surrounding white outline, or it won't match.
    Okay, so will your OCR software do what I just posted?

  23. #123
    🏆🏆🏆🏆🏆 ElNono's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Apr 2007
    Post Count
    153,473
    This is what I'm talking about.
    I see what you're talking about. I have the original PDF with me here.

    Notice the white surrounding outline matching the text. You can't 'add that text later' without fixing the outline too.

  24. #124
    🏆🏆🏆🏆🏆 ElNono's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Apr 2007
    Post Count
    153,473
    Okay, so will your OCR software do what I just posted?
    Some laser copiers already do that.

  25. #125
    I don't really care... Yonivore's Avatar
    My Team
    San Antonio Spurs
    Join Date
    Oct 2001
    Post Count
    26,781
    Okay, back to my other questions...to some and not others?



    Why not the 1?
