I think the email that Chris drafted can go out as-is. It hits issues that are immediately visible and hopefully non-controversial.

While a more general discussion of UUIDs as primary keys may be desirable, I can see it going down a rabbit hole and obscuring the other concerns. Let's treat that as a separable issue.

-Tod

On Jan 12, 2018, at 1:53 PM, Dale Arntson <arnt@uchicago.edu> wrote:

All good points. I was just thinking of the suitability of UUIDs as primary keys for postgres. It may be that the developers feel that UUIDs provide them with maximum flexibility for choosing different data store technologies down the road. I guess, if I could make a more general point, there is a cost involved in not designing your application to maximize the technical capabilities of the platforms that you choose to host it on.

-dale


On 1/12/2018 1:12 PM, Chris Manly wrote:
I'm inclined to not wade into that issue right now, for two reasons:

First, that is an architectural issue that's not specifically within the scope of data migration, and therefore I see it as outside our charge.  It's also pretty pervasive throughout the current implementation of FOLIO and would require a major restructuring to change at this point. I think it would be a good thing to bring up on discuss.folio somewhere, but I don't think it's appropriate for the SysOps SIG to be raising it as an issue from our perspective.

Second, that analysis (as I read it) is presuming a traditional RDBMS table structure for data storage.  I don't know enough about how the data is stored to be able to judge whether it actually applies in practice to the way data is stored for FOLIO right now (or would be in the future).  Keying off UUID may, in fact, be a very efficient way of storing things in a NoSQL-style document store.

That said, I'm putting both of these assertions in a "this is my personal opinion" category, and if other folks on the SIG think we should take it forward, I won't object.

-- 
Christopher Manly
Director of Library Systems
CU Library Information Technologies
607-255-3344

On January 12, 2018 at 2:03:31 PM, Dale Arntson (arnt@uchicago.edu) wrote:

The technical argument for not using UUIDs should also be mentioned. Primary keys create in-order record layouts where the table rows are just leaves in a B-tree index that are physically laid out on disk in their logical order. Where you have row insertion by in-order primary keys, rows naturally grow the storage on disk by adding new pages and extents without disturbing the storage for the previously added rows. However, when you use UUIDs as your primary keys, new rows have to be added in a more or less random order. This will cause page splits and merges as rows are added, to accommodate the insertion of a new row into the middle of an already optimized storage space for existing rows. This operation is quite expensive when compared with adding a new row at the high end of a range of primary keys. Also, it is likely to produce sparsely populated and fragmented storage that will require table maintenance to fix.
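
To make the effect concrete, here is a rough Python sketch. It is a toy model of sorted leaf pages, not how Postgres or any particular engine actually stores rows. The page capacity, the function names, and the use of random 128-bit integers as stand-ins for UUIDs are all assumptions made for illustration.

```python
import bisect
import random

PAGE_CAPACITY = 64  # keys per leaf page; arbitrary value for illustration


def simulate_inserts(keys, capacity=PAGE_CAPACITY):
    """Insert keys into a toy store of sorted pages.

    Returns (mid_page_splits, average_page_fill).
    """
    pages = [[]]  # each page holds a sorted list of keys
    lows = []     # lows[i] is the smallest key that belongs on pages[i + 1]
    splits = 0
    for key in keys:
        idx = bisect.bisect_right(lows, key)  # page that should hold this key
        page = pages[idx]
        if len(page) < capacity:
            bisect.insort(page, key)
        elif idx == len(pages) - 1 and key > page[-1]:
            # Append at the high end: start a fresh page, nothing else moves.
            pages.append([key])
            lows.append(key)
        else:
            # Insert into the middle of a full page: split it in half.
            splits += 1
            mid = capacity // 2
            right = page[mid:]
            del page[mid:]
            pages.insert(idx + 1, right)
            lows.insert(idx, right[0])
            bisect.insort(right if key >= right[0] else page, key)
    fill = sum(len(p) for p in pages) / (len(pages) * capacity)
    return splits, fill


def compare(n=20_000, seed=1):
    """Run the same number of inserts with in-order and random keys."""
    rng = random.Random(seed)
    sequential = range(1, n + 1)
    random_keys = [rng.getrandbits(128) for _ in range(n)]  # UUID stand-ins
    return simulate_inserts(sequential), simulate_inserts(random_keys)


if __name__ == "__main__":
    (seq_splits, seq_fill), (rnd_splits, rnd_fill) = compare()
    print(f"in-order keys: {seq_splits} splits, pages {seq_fill:.0%} full")
    print(f"random keys:   {rnd_splits} splits, pages {rnd_fill:.0%} full")
```

In this model the in-order keys never split a page and leave the pages nearly full, while the random keys split pages repeatedly and leave them noticeably emptier. Whether a real system behaves this way depends on how it actually lays out its data.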

-dale



On 1/12/2018 11:27 AM, Chris Manly wrote:
I'm happy to reach out to Holly and Sharon Wiles-Young.  Would this capture our concerns adequately?


Dear Sharon and Holly,

Two issues came up in our discussions during the SysOps SIG meeting this week that we felt should get a bit of air time in the upcoming developer meeting.  In the absence of a product owner for this area, we wanted to put these ideas in front of you in hopes that they can make their way in front of the right set of developers before the Madrid meeting.  Perhaps these issues have already been explored/addressed, but if not it seemed like a good time to get them out there.

1) Consistency of API interfaces: The question was raised in the SysOps SIG as to whether any attention is being paid to making the batch/bulk API interfaces consistent in their behavior and conventions.  There's a good bit of effort going into making the UI consistent across apps, and while it may not be as critical to make back-end APIs consistent across apps, this is one place where it seemed like it would be worth checking.  If the right folks are thinking about it early enough in the process, it shouldn't be that difficult to have the batch/bulk interfaces behave in similar ways.

2) Concerns about the use of UUIDs for primary keys as it pertains to migrations: Currently, our existing ILS systems generally use numeric primary keys for data tables.  As such, we can refer to bib records, patron records, etc. by their numeric IDs fairly reasonably.  Linkages between records similarly rely on those keys.  With FOLIO's use of UUIDs as the primary key, there is going to be a challenge in migration because we won't be able to rely on re-use of those keys as we migrate our data in.  When looking at isolated records, it's not a huge deal.  For example, we could put the Voyager BibID into a field in a metadata record, and be able to refer to it.  But when we need to migrate in the holdings and items that are related to that bib, we need to be able to link the records together properly and we wouldn't be able to use that BibID.  There is an issue of maintaining the referential integrity of the dataset that, while not insurmountable, is made much, much more complex (and potentially error-prone) by the change in primary key.
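
As a rough illustration of the linkage problem, here is one approach a migration script could take, sketched in Python: derive each record's UUID deterministically from its legacy numeric ID (a name-based version 5 UUID), so a holding can compute its bib's UUID without a lookup table. This is only a sketch under assumptions. The namespace URL, the function names, and the record fields are hypothetical and are not FOLIO's actual schema or method.

```python
import uuid

# Hypothetical namespace for one migration; any fixed UUID works as long as
# every script in the migration uses the same one.
MIGRATION_NS = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/ils-migration")


def legacy_to_uuid(record_type: str, legacy_id: int) -> uuid.UUID:
    """Map a legacy numeric ID to a stable UUID (same input, same output)."""
    return uuid.uuid5(MIGRATION_NS, f"{record_type}:{legacy_id}")


def migrate_bib(legacy_bib: dict) -> dict:
    """Build a migrated bib record (illustrative fields only)."""
    return {
        "id": str(legacy_to_uuid("bib", legacy_bib["bib_id"])),
        "legacy_bib_id": legacy_bib["bib_id"],  # keep the old ID for reference
        "title": legacy_bib["title"],
    }


def migrate_holding(legacy_holding: dict) -> dict:
    """Build a migrated holding that points at its bib by UUID."""
    return {
        "id": str(legacy_to_uuid("holding", legacy_holding["holding_id"])),
        # Recompute the parent's UUID from the legacy BibID; no lookup needed.
        "bib_uuid": str(legacy_to_uuid("bib", legacy_holding["bib_id"])),
    }


if __name__ == "__main__":
    bib = migrate_bib({"bib_id": 12345, "title": "Example title"})
    holding = migrate_holding({"holding_id": 678, "bib_id": 12345})
    print(holding["bib_uuid"] == bib["id"])
```

The alternative would be to generate random UUIDs and carry a legacy-ID-to-UUID mapping table through every stage of the migration, which is where the added complexity and the potential for errors come in.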


-- 
Christopher Manly
Director of Library Systems
CU Library Information Technologies
607-255-3344

On January 12, 2018 at 11:51:43 AM, Chris Creswell (ccc2@lehigh.edu) wrote:

I think it's a good idea to send an e-mail to both Sharon (probably Wiles-Young at Lehigh) and Holly about API consistency and UUID usage for primary keys, and sooner rather than later.

-Chris


On 01/12/2018 10:03 AM, Ingolf Kuss wrote:
Hi all,
I completed the notes of yesterday's SysOps meeting; please review them as your time allows: https://wiki.folio.org/display/SYSOPS/2018-01-11+-+System+Operations+and+Management+SIG+Notes

Please read the action items at the bottom of the page.
The last action item is particularly urgent. I think it will be a little late if we ask Holly or Sharon (which one of the two Sharons?) only after next Thursday's meeting to present the two SysOps issues in Madrid.  Shouldn't I send one of them an email today or on Monday and ask whether she can present the issues (or delegate the presentation)? What do you think about this? The final list of issues could be handed over to that person later (on Thursday). What about asking Holly now?

Best,
Ingolf

Dr. Ingolf Kuss
hbz - Hochschulbibliothekszentrum NRW
Postfach 270451
50510 Köln
Germany
Tel.: (+49) (0) 221 400 75-161
------------------------------------------

------------------------------------------------------
You received this message because you are subscribed to OLE Mailing List
"sysops-sig".
To unsubscribe from this list and stop receiving emails from it, follow
this link: http://archives.simplelists.com.
To post to this group, send email to
sysops-sig@ole-lists.openlibraryfoundation.org
<mailto:sysops-sig@ole-lists.openlibraryfoundation.org>.
Visit this group at
https://ole-lists.openlibraryfoundation.org<https://ole-lists.openlibraryfoundation.org>
.

-- 
Christopher Creswell
Library and Technology Services
Sr. Library Systems Analyst
(610) 758-1432
ccc2@lehigh.edu