Document capture

Document capture

4D Tech mailing list
Hi all,

I’m beginning to work on a new project (4D v16 on Windows) for a client that handles a LOT of physical documents for their clients. They’ve got a huge storage issue and when they need to refer to a document, they spend huge amounts of time searching the physical files.

I’ve not started prototyping anything yet but I think I’ve got a viable approach. The server will have a shared directory with a sub-directory for each of their clients. There will be a dialog where the user enters information about the document, including a text box where they can enter a brief description of the document. The user would then drag-and-drop a scan of the document onto the description text box and an “on drop” event would trigger a document capture method. This method will have to rename the document (the file-name will be created automatically within 4D without changing the extension), check that the relevant sub-directory exists on the server (and create it if it does not), and then save the renamed file to the server.
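A minimal sketch of the capture step described above, written in Python rather than 4D purely for illustration. The function name `capture_document`, its parameters, and the sample file names are assumptions; in 4D the same steps would be performed from the "on drop" handler.

```python
import shutil
import tempfile
from pathlib import Path


def capture_document(dropped_file: Path, server_root: Path,
                     client_id: str, new_stem: str) -> Path:
    """Copy a dropped scan into the client's sub-directory under a generated name."""
    client_dir = server_root / client_id
    # Create the client's sub-directory if it does not exist yet.
    client_dir.mkdir(parents=True, exist_ok=True)
    # Only the stem is replaced; the original extension is kept.
    target = client_dir / (new_stem + dropped_file.suffix)
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(dropped_file, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        scan = Path(tmp) / "scan0001.pdf"
        scan.write_bytes(b"placeholder scan")
        stored = capture_document(scan, Path(tmp) / "share", "client_042", "DOC-000123")
        print(stored.name)  # DOC-000123.pdf
```

Refusing to overwrite an existing target is a design choice of this sketch, not something the post specifies.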

If any of you have done something similar, I would really appreciate any feedback on my approach and would welcome any suggestions, pseudo-code, or code that you would be willing to share.

Thanks much,

Ken Geiger
Dolores, CO
[hidden email]

**********************************************************************
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:[hidden email]
**********************************************************************

Re: Document capture

4D Tech mailing list
Hi Kenneth,

So I guess all documents are scans, i.e. pictures?
You might want to look at the Tesseract plugin by Keisuke; it has fairly usable OCR.
If you have electronic documents, like doc or docx, have a look at automatic doc-to-PDF conversion; I do it with a shareware tool called total-doc-converter.
I would advise against using a named folder structure on the server.
Keep the structure in the data and then just use the 4D tools to store data externally, with your own path calculated in the trigger.
Example:
doc id: abcd1234def5678
path:

ab
  cd
    12
      34
etc.
This keeps the content structures small and is unaffected by renaming any of the elements (clients, projects, etc.).
Your mileage may vary though.
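The ID-derived path in the example can be sketched as follows (Python, for illustration only). The function name `sharded_path`, the default of four two-character levels, and storing the file under its full ID are assumptions; the post only shows the first four levels and then "etc.".

```python
from pathlib import PurePosixPath


def sharded_path(doc_id: str, levels: int = 4, width: int = 2) -> PurePosixPath:
    """Derive nested folder names from the leading characters of a document ID."""
    needed = levels * width
    if len(doc_id) < needed:
        raise ValueError(f"document ID must have at least {needed} characters")
    # Cut the first `needed` characters into chunks of `width` characters each.
    parts = [doc_id[i:i + width] for i in range(0, needed, width)]
    # Assumption: the document itself is stored under its full ID.
    return PurePosixPath(*parts, doc_id)


if __name__ == "__main__":
    print(sharded_path("abcd1234def5678"))  # ab/cd/12/34/abcd1234def5678
```

Because the path depends only on the document ID, renaming a client or project never moves a file.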
Good luck



Re: Document capture

4D Tech mailing list
There are both services and software that can do OCR on the scanned documents.

You could (or your client could) investigate this, and then the scanned documents are already text.
This would allow you (and them) to process large volumes of files with minimal or no user input.
This would be MUCH faster.

I imagine something like:
Outside of your system
- scanning system is fed documents
- documents are scanned and OCRed
- the resulting files (one a scanned image, the other the text) are placed into specified directory(ies)

In your system (background)
- your software (on the server) picks up each scanned image, places it into some appropriate directory structure, and saves a path to that file in a record, probably the record used in the next step
- the text file is opened, read/imported, and keyworded for searching; the file path to the scanned image is added, and the record is tagged as needed for other references (client ID, doc type, etc.)

In your system (client side / UI)
- client enters a search for a doc type, client keyword
- matching document(s) are displayed
- user can view what they need, or they can download a copy of the scanned image file
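The background pickup and keyword search outlined above could look roughly like this (a Python sketch, not 4D code). It assumes each scan and its OCR text share a base name (for example `inv1.png` and `inv1.txt`), and it uses an in-memory dictionary as a stand-in for the keyword table; `ingest` and `search` are hypothetical names.

```python
import re
import shutil
import tempfile
from pathlib import Path


def ingest(inbox: Path, archive: Path, index: dict) -> int:
    """Move each scan that has a matching .txt file and index the words of that text."""
    moved = 0
    for text_file in sorted(inbox.glob("*.txt")):
        images = sorted(p for p in inbox.iterdir()
                        if p.is_file() and p.stem == text_file.stem
                        and p.suffix.lower() != ".txt")
        if not images:
            continue  # OCR text without an image: leave it for a later pass
        archive.mkdir(parents=True, exist_ok=True)
        stored = archive / images[0].name
        shutil.move(str(images[0]), str(stored))
        text = text_file.read_text(encoding="utf-8").lower()
        # Keyword every run of three or more letters/digits to the stored path.
        for word in set(re.findall(r"[a-z0-9]{3,}", text)):
            index.setdefault(word, set()).add(str(stored))
        text_file.unlink()
        moved += 1
    return moved


def search(index: dict, keyword: str) -> list:
    """Return the stored paths whose OCR text contained the keyword."""
    return sorted(index.get(keyword.lower(), set()))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        inbox, archive = Path(tmp) / "inbox", Path(tmp) / "archive"
        inbox.mkdir()
        (inbox / "inv1.png").write_bytes(b"image bytes")
        (inbox / "inv1.txt").write_text("Invoice for ACME roofing", encoding="utf-8")
        index = {}
        print(ingest(inbox, archive, index), search(index, "acme"))
```

In a real system the index would live in database records alongside the client ID and document type tags.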

Chip


Re: Document capture

4D Tech mailing list
I have built a few document management systems. One thing you need to be
careful of is how many files are in a particular folder. What I did at one
client is create a folder structure as follows. Client ID was a long
int. Client ID 101123, for example, would have the following structure. I
would also create a path to each document in a record about that document.

101/1011/101123/documents themselves. That way, at each level of the directory
structure, you can never have more than 999 folders. Each client folder would
then hold all of that client's documents.

Client ID 101124 would be
101/1011/101124/documents themselves

For one customer the original system put all images in one directory, and it
had over 500,000 documents in it. I chose this method so that the web site
could calculate the folder structure and not have to do lookups for every
document.
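The folder calculation above can be sketched in a few lines (Python, for illustration). The function name `client_folder` is hypothetical, and the sketch assumes client IDs have at least four digits, as in the examples.

```python
def client_folder(client_id: int) -> str:
    """Build the three-level folder path from the leading digits of the client ID."""
    digits = str(client_id)
    if len(digits) < 4:
        raise ValueError("this sketch assumes client IDs of at least four digits")
    # First three digits, then first four digits, then the full ID.
    return f"{digits[:3]}/{digits[:4]}/{digits}"


if __name__ == "__main__":
    print(client_folder(101123))  # 101/1011/101123
    print(client_folder(101124))  # 101/1011/101124
```

Since the path is a pure function of the ID, a web site can compute it without a database lookup.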

Hope this helps

Regards
Chuck


--
-----------------------------------------------------------------------------------------
 Chuck Miller Voice: (617) 739-0306 Fax: (617) 232-1064
 Informed Solutions, Inc.
 Brookline, MA 02446 USA Registered 4D Developer
       Providers of 4D, Sybase & SQL Server connectivity
          http://www.informed-solutions.com
-----------------------------------------------------------------------------------------

Re: Document capture

4D Tech mailing list
On Thu, Jan 25, 2018 at 6:39 PM, Kenneth Geiger via 4D_Tech <
[hidden email]> wrote:

>
> If any of you have done something similar, I would really appreciate any
> feedback on my approach and would welcome any suggestions, pseudo-code, or
> code that you would be willing to share.
>

We did something similar, and I would add a few comments:

It seems that modern systems do not have problems with a huge number of
files in a folder, so we use only a two-level structure: one folder for each
client, and inside it all of that client's files.

We store a small preview of each document (500 x 500) in the data file.
Compressed, it is a few kB, but with a lot of documents the database grows
pretty fast (we have instances of ~20 GB files).

We implemented the preview of an individual client's documents as (scrollable)
SVG pictures built from data in the database: preview, name, size, creation
date, etc. It seems to work well even for a large number of files; however, we
had to handle actions such as selection highlighting, D&D, etc. ourselves. We
tested picture arrays displayed by ALP and specialized plugins, but this works
for us quite well.

The situation becomes more complex when a user wants to edit stored documents
in third-party applications. I have not found a good way to find out when the
user has finished editing. We ended up displaying a small dialog where the user
clicks when they have finished editing, and then the code moves the document
back to storage.

--
Peter Bozek

Re: Document capture

4D Tech mailing list
Yes, the OS is fine, but if you ever want to open the folder on a PC or Mac, be prepared to wait. The more files, the longer the wait. At least that has been my experience.

Regards

Chuck

> On Jan 25, 2018, at 3:35 PM, Peter Bozek via 4D_Tech <[hidden email]> wrote:
>
>
> It seems that modern systems does not have problems with huge amount of
> files in folders, so we do only two level structure - one for each client
> and inside all client's files.


RE: Document capture

4D Tech mailing list
Kenneth,

I have a system that tracks signatures, and we get thousands of signatures every day.  We started storing these several years ago, and we were originally storing all images in one place.  On our first attempt we stored the .png files as Image fields with the "store outside the datafile" setting on.  It was so easy to set up and we were off to the races, but then we ran into problems.  Next we tried storing all of the images in one OS folder.  Once we got over about 25k signature .png files in a directory, we started seeing problems.  The main problem is that if you ever need to browse to the directory manually to find or fix something, the OS freezes for some time.  It gets worse as the number of files grows.  We currently have about 10 million signature images stored, and we now break them down into several groupings per day, with a few thousand images per directory.  This has solved the majority of our problems.

Regarding storing in the datafile (externally stored): don't do it if you have a high volume of documents.  Here are a few of the problems I have had with this method.

1.  Backups take much longer because they also have to back up all of your externally stored documents.
2.  Your 4D backups can get HUGE.
3.  Restoring your backup takes much, much longer because it has to write all of the externally stored documents out to disk.  This really hurt one day when our server crashed and the restore took several hours, versus about 25 minutes since we removed the images.  Our datafile is about 70 GB, not counting our signature images.
4.  If something happens and you get a corrupt document on disk or a corrupt directory, 4D's compact/verify freaks out because it can't access an externally stored document it wants.  You then have to essentially loop through the datafile, attempt to load each document that is stored on disk, and, if you get an error, re-save the record with the document removed.  Without this, your compact/verify fails.

I highly recommend segmenting your documents into smaller chunks of up to a few thousand files per directory.  I also recommend using something like Amazon S3 with geo-redundant storage for storage and backup of those documents.  It's more work up front but can make your life much easier over the long run.  The 4D Method user group had a presentation on using AWS a couple of years ago that can help get you on the right track there.
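One way to get "several groupings per day, with a few thousand images per directory" is to derive the directory from the date and a running count. This Python sketch is only an illustration: the name `signature_dir`, the `YYYY/MM/DD/NNN` layout, and the limit of 2000 files are assumptions, not the layout actually used.

```python
from datetime import date


def signature_dir(day: date, sequence: int, per_dir: int = 2000) -> str:
    """Return the directory for the Nth image of a day, capped at per_dir files each."""
    if sequence < 0 or per_dir < 1:
        raise ValueError("sequence must be >= 0 and per_dir >= 1")
    # Images 0..per_dir-1 go to group 000, the next per_dir to group 001, and so on.
    group = sequence // per_dir
    return f"{day:%Y/%m/%d}/{group:03d}"


if __name__ == "__main__":
    print(signature_dir(date(2018, 1, 26), 4500))  # 2018/01/26/002
```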



Thanks
Justin

Re: Document capture

4D Tech mailing list
Hi Kenneth,

I’ve also implemented something similar to what others have done.

But in my case, instead of having a folder for each client, I have one folder per clientID\1000 (client ID mod 1000), so a maximum of 1000 folders. Then I have a second folder level for document classes/categories. Thus my folder structure is something like:

- iDoc
— P001
——— L100
——— L200
——— …
— P002
——— L100
——— L200
——— …
— …
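A rough Python sketch of how such a path might be computed. It follows the "client ID mod 1000" reading, since that is what bounds the first level at 1000 folders; if integer division was meant, replace `%` with `//`. The `P`/`L` prefixes and zero padding are inferred from the tree above, and `idoc_folder` is a hypothetical name.

```python
def idoc_folder(client_id: int, doc_class: int) -> str:
    """Build iDoc/P<bucket>/L<class> from a client ID and a document class code."""
    if client_id < 0 or doc_class < 0:
        raise ValueError("client_id and doc_class must be non-negative")
    # Assumption: the first level is the client ID modulo 1000 (at most 1000 folders).
    bucket = client_id % 1000
    return f"iDoc/P{bucket:03d}/L{doc_class}"


if __name__ == "__main__":
    print(idoc_folder(45001, 100))  # iDoc/P001/L100
```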


I also have the ability to use FTP to transfer files to the server, thus avoiding the need for a shared folder. One of the reasons was that in many installations, with hundreds of thousands of documents, sharing a folder on Windows is very inefficient. Thus, instead of requiring Clients to share a folder, the documents are sent to the server via FTP. All you need is to enable an FTP server on the 4D Server box, or it can be anywhere. In some cases we have the FTP server together with 4D Server because the server also needs access to those documents.

That stuff was initially developed in 4D 2004. With recent versions you could maybe transfer documents to/from the server by running methods set to ‘run on server’ that do a BLOB TO DOCUMENT and DOCUMENT TO BLOB, so the documents would be transferred from/to the server by 4D. I have not tried that and thus have no idea how efficient it would be. I have not updated my code as it works fine using FTP; if it ain’t broken…

hth
julio

--
Julio Carneiro
[hidden email]




Re: Document capture

4D Tech mailing list
Hi Kenneth,

You've gotten a lot of good feedback already. I deal with a lower volume of
documents but high access rates. Last year I moved the actual document
storage to AWS using Bruno LeGay's component. Storing large numbers of
actual documents in 4D isn't really feasible. Even if you use the 'store
outside of datafile' option the backups still include the physical files.
And you start to run into the OS limitations on numbers of files in folders
at some point. I decided to store an index of the documents in 4D,
including the AWS file paths. Within 4D you can access the documents most
easily with a web browser - either the user's system browser or in a web
area. Or you can embed the link in a web page. I do this with images, for
example. And I recommend storing two versions of images as well: a
thumbnail and the actual doc.

In my workflow this is great. Referencing the documents is a snap and
doesn't consume tons of my bandwidth. If the user wants to keep a copy they
just download it. It would be just as easy for you to download the file as
well if you actually need to. And AWS is very cost effective for this sort
of task.

This was a good solution for us. AWS can handle terabytes of documents in a
flash and you can create sophisticated levels of security if you need it.
Regardless of how much storage space you need your 4D datafile remains
manageable.

Kirk Brooks
San Francisco, CA

Re: Document capture

4D Tech mailing list

> On 26 Jan 2018, at 15:55, Kirk Brooks via 4D_Tech <[hidden email]> wrote:
>
> Hi Kenneth,
>
> Even if you use the 'store outside of datafile' option the backups still include the physical files.

It's an exciting feature, but after some tests I gave up. I think it's only halfway there. A single query and the whole record is sent. Some field options would make it better:
• don't load with record (something like a command 'External field load')
• exclude from backup (very easy IMHO)

--
Arnaud de Montard





Re: Document capture

4D Tech mailing list
Kirk et al,
no one listens but I'll be the voice of "dissent"  :)

AWS, iCloud, Dropbox, Google storage, etc.
ALL have the same drawbacks:
- You (your company, your client) no longer have control over these files.
- Who ELSE is looking at your file(s)? This is especially relevant in a competitive business environment, or under Gov't-regulated privacy.
- What are they (the storage company/cloud service) doing with your files/data?
  -- Google most likely scans ALL files stored on their servers, at a minimum to target advertising.
- If you care, what about Gov't three-letter agencies? (And not just from the US, but other countries too: China, Russia, the EU, etc.)
  -- In some cases it has been shown (is believed) that these agencies can/do give their country's business(es) access to proprietary data from other companies and other countries.

my 2 cents.


Re: Document capture

4D Tech mailing list
Arnaud,
On Fri, Jan 26, 2018 at 7:34 AM, Arnaud de Montard via 4D_Tech <
[hidden email]> wrote:
>
> > On 26 Jan 2018, at 15:55, Kirk Brooks via 4D_Tech <[hidden email]> wrote:
> > Even if you use the 'store outside of datafile' option the backups still
> > include the physical files.
>
> Some field options would make it better:
> • don't load with record (something like a command 'External field load')

My understanding is this is the way it works already.
Have you seen otherwise?

> • exclude from backup (very easy IMHO)

I thought about this too and concluded it's probably not a good idea. The
reason is that it creates two 'classes' of data, and that's not good. If
something is part of the datafile then it's a full citizen and treated
accordingly. Creating this second class of data, which may or may not be
backed up and therefore may or may not be accurate, introduces a lot of
complexity to the idea of a 'backup' and makes it impossible to guarantee a
restore is accurate.

I think the 'store outside the datafile' option is misused if you don't want
it backed up. So the sorts of solutions we are discussing really are the
better approach.

I don't recall where, but at some point I was part of a conversation with
someone from 4D talking about this external storage option, and the idea for
it was all about improving query speed, which is why I say I believe the
external data are not returned with the record unless you reference that
field directly. It's not about storage optimization but query optimization.

Kirk Brooks
San Francisco, CA
=======================

*We go vote - they go home*

Re: Document capture

4D Tech mailing list
In reply to this post by 4D Tech mailing list

> On 26 Jan 2018, at 17:12, Chip Scheide via 4D_Tech <[hidden email]> wrote:
>
> Kirk et al,
> no one listens but I'll be the voice of "dissent"  :)

I understand what you mean… I don't like the idea of having my documents in iCloud, for example. I was very interested in this kind of storage after Bruno Legay's conference 2 years ago; it solved a lot of problems I had. But in France you can't store documents containing medical information abroad, and even within France it is subject to very strict rules, which rules out AWS. The question is not whether to store in the cloud, but who does it and under what conditions…

--
Arnaud de Montard





Re: Document capture

4D Tech mailing list
In reply to this post by 4D Tech mailing list

> On 26 Jan 2018, at 18:49, Kirk Brooks via 4D_Tech <[hidden email]> wrote:
>
> Arnaud,
> On Fri, Jan 26, 2018 at 7:34 AM, Arnaud de Montard via 4D_Tech <
> [hidden email]> wrote:
>>
>> Some field options would make it better:
>> • don't load with record (something like a command 'External field load')
>>
> My understanding is this is the way it works already.
> Have you seen otherwise?

Things may have changed, but last time I tried, LOAD RECORD would grab the whole stuff (and I suppose it could be considered as a bug if not). I store photographs, for example, in 3 different resolutions; 4D's external storage would only reduce my 4dd size, nothing else. With my own external storage I don't have this problem.
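
A minimal sketch of the "home made" external storage idea, written in Python rather than 4D and using hypothetical names throughout (ExternalPhoto, store_photo, the client_a folder). It assumes the record keeps only a relative path and that the file is read the first time the field is actually referenced; it is an illustration of the approach, not how 4D's own external storage is implemented.

```python
import tempfile
from pathlib import Path


class ExternalPhoto:
    """Hypothetical record: stores only a relative path, bytes stay on disk."""

    def __init__(self, root: Path, relative_path: str):
        self.root = root
        self.relative_path = relative_path  # the only thing kept "in the record"
        self._cache = None                  # filled on first access to .data

    @property
    def data(self) -> bytes:
        # Read the file lazily, the first time the field is referenced.
        if self._cache is None:
            self._cache = (self.root / self.relative_path).read_bytes()
        return self._cache

    @property
    def is_loaded(self) -> bool:
        return self._cache is not None


def store_photo(root: Path, client: str, name: str, payload: bytes) -> ExternalPhoto:
    """Write payload under root/client/name, creating the sub-directory if needed."""
    folder = root / client
    folder.mkdir(parents=True, exist_ok=True)
    (folder / name).write_bytes(payload)
    return ExternalPhoto(root, f"{client}/{name}")


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        photo = store_photo(Path(tmp), "client_a", "scan_low.jpg", b"fake image bytes")
        loaded_before = photo.is_loaded   # record exists, file not read yet
        size = len(photo.data)            # first access reads the file
        return loaded_before, photo.is_loaded, size
```

Calling demo() creates the record without touching the file contents, then reads them on first access. Storing three resolutions would simply mean three such paths per record, each loaded only when needed.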

>> • exclude from backup (very easy IMHO)
>>
> I thought about this too and concluded it's probably not a good idea. The
> reason being it creates two 'classes' of data and that's not good.

Yes, perfectly true. But it's exactly the same if I create my own "home made" external storage, and I don't ask 4D to care about it. Makes me think of the backup file: I never understood why a 4D backup must be a file instead of a folder containing what I need to restart immediately (4dd, 4dindx, etc). No, I have to _extract_ the monster first. Sometimes I wonder if they know we are adults.

--
Arnaud de Montard





Re: Document capture

4D Tech mailing list
Arnaud,

On Fri, Jan 26, 2018 at 10:28 AM, Arnaud de Montard via 4D_Tech <
[hidden email]> wrote:

>
> > On 26 Jan 2018, at 18:49, Kirk Brooks via 4D_Tech <[hidden email]> wrote:
> >
> > Arnaud,
> > On Fri, Jan 26, 2018 at 7:34 AM, Arnaud de Montard via 4D_Tech <
> > [hidden email]> wrote:
> >>
> >> Some field options would make it better:
> >> • don't load with record (something like a command 'External field load')
> >>
> > My understanding is this is the way it works already.
> > Have you seen otherwise?
>
> Things may have changed, but last time I tried, LOAD RECORD would grab the
> whole stuff (and I suppose it could be considered as a bug if not).

Hmm, that's what I would expect from LOAD RECORD. The file/data is part of
the record, after all. But it wouldn't be the case if I were simply
referencing a field in that record from a linked record. For instance, let's
say there is a table named [Photo] where the image file is stored. If I have
a form that displays a record in [Photographer] which links to [Photo]Title,
it's my understanding that simply referencing Title wouldn't load the entire
record.

I believe queries are optimized to only load the data stored in the record.


Kirk Brooks
San Francisco, CA
=======================

*We go vote - they go home*

Re: Document capture

4D Tech mailing list

> On 26 Jan 2018, at 20:15, Kirk Brooks via 4D_Tech <[hidden email]> wrote:
>
> I believe queries are optimized to only load the data stored in the record.

Queries are optimized to search where you tell them to search: 4D will try to use an index, but records and even external data will be read if the query needs them.
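
To illustrate the distinction, here is a toy model in Python (not 4D code, and not a description of 4D's actual query engine). The Photo and PhotoTable names and the records_read counter are hypothetical; the sketch only assumes that a query on an indexed field can be answered from the index, while a query on a non-indexed field has to read every record.

```python
from dataclasses import dataclass


@dataclass
class Photo:
    title: str    # indexed field in this toy model
    caption: str  # non-indexed field stored in the record


class PhotoTable:
    def __init__(self, records):
        self.records = list(records)
        self.records_read = 0  # counts how many records each query had to read
        # Index on title: value -> list of record positions.
        self.title_index = {}
        for pos, rec in enumerate(self.records):
            self.title_index.setdefault(rec.title, []).append(pos)

    def query_title(self, title):
        # Indexed field: the index gives the positions, only hits are read.
        hits = self.title_index.get(title, [])
        self.records_read += len(hits)
        return [self.records[pos] for pos in hits]

    def query_caption(self, word):
        # Non-indexed field: every record must be read and inspected.
        self.records_read += len(self.records)
        return [rec for rec in self.records if word in rec.caption]
```

With three records, a title query reads only the matching record, whereas a caption query reads all three. A query on externally stored data would, by the same logic, have to open the external files as well.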

--
Arnaud de Montard





Re: Document capture

4D Tech mailing list
In reply to this post by 4D Tech mailing list

> On 27 Jan 2018, at 03:12 AEDT, Chip Scheide via 4D_Tech <[hidden email]> wrote:
>
> CLOUD Warnings

I couldn’t agree more, but are they listening?

Cloud is another hyped buzzword, and I get the impression that the very companies offering such services are pushing it so they can do exactly what you described.

Cloud-storage is convenient, but everything pumped up to it has to be encrypted!

my 2¢

Cheers
Jörg