Imported Binary Field - Persistence
Intent
Provide transparent storage of unstructured binary fields that contain imported data rather than information created with the application.
Motivation
Database applications sometimes need to store unstructured binary data. The business logic always handles this data as a single unit and performs little or no processing on it. Examples include PDF files and images.
Where should the data be stored? Storing it in the database seems the best choice because the ACID properties of an operation are maintained, and all of the application's data is stored in a single location. However, this approach has a few issues.
Databases were not initially designed for large pieces of data. This feature has been added only recently. Support for large objects differs considerably between databases, and in general the support is poor.
Another problem is that the database becomes very large, which may not be desirable. Depending on how large objects are implemented, this might also affect the performance of the database.
A further issue is the type of the field containing the data. An array of bytes and an input stream seem the obvious choices, but each has drawbacks.
Using an array of bytes requires that the complete binary object is kept in memory. This could cause memory problems for server applications that handle many concurrent requests, each of which has to process binary objects.
Using an input stream avoids loading the data into memory, but introduces other problems. An input stream can only be used by a single thread and, depending on the implementation of the stream, might only be usable once, i.e. it is not possible to seek back to the beginning of the stream.
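The single-use limitation can be illustrated with a short sketch. Python is used here only for illustration; the pattern itself is not tied to a language. An operating system pipe stands in for imported data arriving from an upload or a socket, and the helper name `read_twice` is hypothetical.

```python
import os


def read_twice(stream):
    """Read a stream to the end twice and return both results."""
    first = stream.read()
    second = stream.read()  # the stream has already been consumed
    return first, second


# Assumption: a pipe stands in for data imported from an upload or socket.
read_fd, write_fd = os.pipe()
with os.fdopen(write_fd, "wb") as writer:
    writer.write(b"imported PDF bytes")

with os.fdopen(read_fd, "rb") as reader:
    can_rewind = reader.seekable()      # pipes do not support seeking
    first, second = read_twice(reader)  # the second read yields no data
```

Because the stream cannot be rewound, any code that needs the data a second time must obtain it from somewhere other than the original stream.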
Unstructured binary data that is imported into an application and never modified by it is best stored outside the database containing the business objects. By using streams for storing and retrieving information from this external store, the memory problems associated with arrays of bytes are avoided. The external store should make a copy of the data as soon as it is stored, so the input stream used to store an object needs to be read only once. If the data is needed again after storing it, the copy from the external store is used instead of the original.
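A minimal sketch of such an external store follows, assuming a plain directory on the file system as the backing storage. The class name `ExternalBinaryStore`, its methods, and the idea of returning a key that the business object could keep in the database are all assumptions made for illustration, not part of the pattern description.

```python
import io
import os
import shutil
import tempfile
import uuid


class ExternalBinaryStore:
    """Hypothetical file-backed store for imported binary objects."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def store(self, source):
        """Copy the stream into the store once and return a key for it."""
        key = uuid.uuid4().hex
        fd, tmp_path = tempfile.mkstemp(dir=self.root)
        with os.fdopen(fd, "wb") as target:
            # Copies in chunks, so the object is never fully held in memory.
            shutil.copyfileobj(source, target)
        # Move the finished copy into place under its key.
        os.replace(tmp_path, os.path.join(self.root, key))
        return key

    def open(self, key):
        """Return a fresh input stream over the stored copy."""
        return open(os.path.join(self.root, key), "rb")


# Usage: the source stream is read once; later reads use the stored copy.
store = ExternalBinaryStore(tempfile.mkdtemp())
key = store.store(io.BytesIO(b"%PDF-1.4 example"))
with store.open(key) as a, store.open(key) as b:
    copy_a, copy_b = a.read(), b.read()
```

Each call to `open` returns an independent stream, so several threads can read the same object without sharing a stream, and the original input is never needed again. This sketch does not address the transactional consistency between the database and the external store mentioned above.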
Implementations
None yet.