ChangeForge...
Where business and technology collide
The Disadvantages of Microsoft SharePoint 2007 as a Document Management System
Started by ChangeForge | Ken Stewart · 10 months ago
Learn why Microsoft's SharePoint technology just isn't ready to take on document management solutions - just yet...
10 months ago
I was wondering why you thought this. If the database houses the metadata, index information, etc., as well as the image or raw document, why is this a problem? Doesn't a database supply data integrity for all items better than having to worry about the joys of links and tags to an external data store?
As you may see at DL, we have created an online document archiving solution (Instant Intelligence Archiving), which we sell via a channel using a SaaS model. All the documents/images are stored within a database, along with (but in a separate DB from) the indexing data (OCRed text, index information supplied by the user, etc.). We did this to help with our own DR process. Yes, there is a speed issue with getting blob data out of the DB, but with the speed of processors that exist now, this speed hit is becoming less and less of a problem.
I would be very interested in hearing your thoughts....
Kindest Regards
Chris
10 months ago
First, let me qualify that I am not a database engineer or DBA. That aside, I work in a position where I have been exposed to a small number of CMS/DMS solutions, including some big names like EMC (Legato) Application xTender and some smaller ones you have probably never heard of.
So here's my take:
We have 2 differing formats for CMS/DMS presentations: 1) the unstructured, "crawl the sprawl" route (e.g. Google), and 2) the highly structured route, as in traditional CMS/DMS offerings. I am focused more on the latter, just to clarify.
Traditionally, metadata is stored in a structured format to speed up the return of information and to improve overall transaction speed and efficiency. You even see this in Business Intelligence (BI) software, where data is cubed to speed up queries over large volumes of information. However, most document management scenarios do not carry the computational load that operations at a billion-dollar-plus organization would. Again, my article was focused more on SMBs, which I would think is also the market for your SaaS offering (not having looked at it in depth).
To clarify, my statement was geared more toward what I consider maintainability of the infrastructure. As you know, text is smaller and compresses better than binary image files (traditionally TIF, PDF, BMP). As such, searches on raw text should be much faster than having to parse image files.
Second, in maintaining the necessary archives (in an on-premise solution), keeping the image files outside of the database can make for much cleaner backups. Traditionally, backup agents handle raw files (on an NTFS file system, for instance) much more cleanly than they handle very large databases. The image repository is usually the largest part of a CMS/DMS installation, so making it as flexible as possible benefits the maintainer.
Third, the ability of CMS/DMS administrators to access and maintain images is key. We have found it much easier to manage documents outside of the database when an image file has gone corrupt (or is thought to be corrupt), because we can access the file directly. This matters most where the originals are destroyed soon after the reliability of the system is established. You might argue security as a counterpoint, and this is a difficult challenge, but one that can generally be answered.
Last, and to harp on the backups, many solutions I've worked with support multiple DBs (e.g. SQL, MySQL, Oracle, DB2, etc.). I have worked with a MySQL version of a database where the images were stored within the database, and the major backup software vendors did not (at the time of my research) make an agent that allowed for differential and/or incremental backups. That makes restoration a very dangerous thing, especially where documents are destroyed very soon after the initial scan.
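To illustrate why loose image files lend themselves to incremental backups, here is a minimal Python sketch. It is not how any commercial backup agent works (those typically rely on archive bits, timestamps, or change journals); it simply compares content hashes against a manifest from the previous run. The function names, the manifest format, and the file names are all hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def incremental_backup(source_dir, backup_dir, manifest):
    """Copy only files whose content changed since the previous run.

    `manifest` (hypothetical format) maps relative path -> digest from the
    last run and is updated in place. Returns the relative paths copied.
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    copied = []
    for path in sorted(source_dir.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(source_dir).as_posix()
        digest = file_digest(path)
        if manifest.get(rel) != digest:
            target = backup_dir / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
            manifest[rel] = digest
            copied.append(rel)
    return copied


# Demo with made-up scans in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    repo, backup = Path(tmp, "repo"), Path(tmp, "backup")
    repo.mkdir()
    (repo / "scan001.tif").write_bytes(b"page one")
    (repo / "scan002.tif").write_bytes(b"page two")

    manifest = {}
    first_run = incremental_backup(repo, backup, manifest)   # copies both files
    (repo / "scan002.tif").write_bytes(b"page two, rescanned")
    second_run = incremental_backup(repo, backup, manifest)  # copies only the changed file

print(first_run)
print(second_run)
```

The second run touches only the file that changed, which is the property that is hard to get when every image lives inside one large database file and the backup tool has no agent for it.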
I would submit that I am not familiar with the IIA architecture or design, and I have no doubt CMS/DMS development may one day overcome this. At this point, my experience over the last 3 years has led me to this conclusion. This is not completely scientific, but many ECM vendors and experts share my opinion. SharePoint has some limitations outside of this as well, as I learned in working with one of our Microsoft Gold Certified Partners, which recently conducted an in-depth study for a worldwide automotive corporation.
Again, this is not to say storing the documents within a database is a bad thing in a SaaS offering. I might enjoy taking a tour of your software as time permits over the next few weeks. I firmly believe both SharePoint and SaaS have a huge role to play in the CMS/DMS space, and I have on-going research to do in these areas.
Obviously, you e-mailed me, so I was wondering if you would be agreeable to me posting this conversation thread in the Disqus comments? If not, I will abide by your wishes and look forward to continuing this conversation.
Thanks for making me think about this,
Ken
10 months ago
Firstly I am more than happy for you to publish this conversation, and also give you permission to edit it as you see fit. From your reply and the articles you have published (the ones that I have read) it would appear you have no vested interest in editing our conversation to change the context of my thoughts…
I would agree with many of your points, if not all of them. At Data Liberation we have worked with images, within our DMS application, but also with our Data Capture (OMR) application, and we moved very quickly to using a SQL database for storing the images after enduring the pain of lost files etc within file systems.
Just to cover a couple of points you raise: as I mentioned in my first email, we use two distinct databases, one for meta/index data and one for images/documents. Our system takes whatever it is supplied and stores the file as a blob in one database, thereby ensuring the integrity of the documents (users can look at a document but cannot update it, although they can of course add new versions). If we recognise the file type (e.g. most image formats, Word, RTF, TXT, PDF, etc.), the system will either OCR the document or strip the text out of the document and store it in the other database. Additionally, users can add their own metadata to the file. By having all this text-based information in one database, we are able to query the documents very quickly and then retrieve a document only if the user requests it.
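A rough Python sketch of the two-database layout described above, using two in-memory SQLite databases as stand-ins. The table names, columns, and functions are hypothetical and are not taken from the actual product; the point is only that searches touch the text database and a blob is read only when a document is requested.

```python
import sqlite3

# Two separate in-memory databases stand in for the two SQL databases
# described above (hypothetical schema, for illustration only).
index_db = sqlite3.connect(":memory:")  # metadata and extracted text
image_db = sqlite3.connect(":memory:")  # raw documents stored as blobs

index_db.execute(
    "CREATE TABLE doc_index (doc_id INTEGER PRIMARY KEY, title TEXT, body_text TEXT)"
)
image_db.execute(
    "CREATE TABLE doc_blob (doc_id INTEGER, version INTEGER, content BLOB, "
    "PRIMARY KEY (doc_id, version))"
)


def store_document(doc_id, title, body_text, content):
    """Store the text in the index database and the raw bytes as version 1."""
    index_db.execute(
        "INSERT INTO doc_index (doc_id, title, body_text) VALUES (?, ?, ?)",
        (doc_id, title, body_text),
    )
    image_db.execute(
        "INSERT INTO doc_blob (doc_id, version, content) VALUES (?, 1, ?)",
        (doc_id, content),
    )


def add_version(doc_id, content):
    """Never update a blob in place; append a new version instead."""
    (latest,) = image_db.execute(
        "SELECT MAX(version) FROM doc_blob WHERE doc_id = ?", (doc_id,)
    ).fetchone()
    if latest is None:
        raise KeyError(f"unknown document {doc_id}")
    image_db.execute(
        "INSERT INTO doc_blob (doc_id, version, content) VALUES (?, ?, ?)",
        (doc_id, latest + 1, content),
    )
    return latest + 1


def search(term):
    """Search only the text database; no blob is read here."""
    rows = index_db.execute(
        "SELECT doc_id FROM doc_index WHERE body_text LIKE ? ORDER BY doc_id",
        (f"%{term}%",),
    ).fetchall()
    return [row[0] for row in rows]


def fetch_latest(doc_id):
    """Retrieve the newest blob only when a user asks for the document."""
    row = image_db.execute(
        "SELECT content FROM doc_blob WHERE doc_id = ? ORDER BY version DESC LIMIT 1",
        (doc_id,),
    ).fetchone()
    return row[0] if row else None


store_document(1, "Invoice 1001", "invoice for scanning services", b"%PDF-fake-1")
store_document(2, "Contract", "service contract terms", b"%PDF-fake-2")
add_version(2, b"%PDF-fake-2b")

print(search("invoice"))   # ids of matching documents, found without reading blobs
print(fetch_latest(2))     # newest stored version of document 2
```

A production system would use a full-text index rather than `LIKE`, which is used here only to keep the sketch self-contained.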
As we use MS SQL 2005 (with sights on SQL 2008 before the end of the year), we have the benefit of being able to do incremental backups of either database. In our case we do log backups every 15 minutes on both databases, giving us almost continuous backup protection; full backups happen overnight.
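To make the effect of that schedule concrete, here is a small Python sketch under stated assumptions: a nightly full backup at 01:00, log backups every 15 minutes afterwards, and a failure at 10:52. The dates, times, and the `restore_chain` helper are hypothetical; the sketch only models which backups would be replayed and how much recent work is exposed, not any actual restore procedure.

```python
from datetime import datetime, timedelta


def restore_chain(full_backups, log_backups, failure_time):
    """Return the latest full backup before the failure and the log backups to replay after it."""
    fulls = [t for t in full_backups if t <= failure_time]
    if not fulls:
        raise ValueError("no full backup precedes the failure")
    base = max(fulls)
    logs = sorted(t for t in log_backups if base < t <= failure_time)
    return base, logs


# Hypothetical schedule: full backup at 01:00, log backups every 15 minutes until 11:00.
night = datetime(2008, 6, 1, 1, 0)
full_backups = [night]
log_backups = [night + timedelta(minutes=15 * i) for i in range(1, 41)]

failure = datetime(2008, 6, 1, 10, 52)
base, logs = restore_chain(full_backups, log_backups, failure)

last_point = logs[-1] if logs else base
exposure = failure - last_point  # work done after the last log backup is at risk

print(f"Restore full backup from {base}, then replay {len(logs)} log backups")
print(f"Changes made in the final {exposure} are not covered")
```

With a 15-minute log interval the exposed window can never exceed 15 minutes, which is what "almost continuous" amounts to in practice.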
The one advantage that you highlight is direct access to documents in the situation where a file has been corrupted. This, I would agree, is much easier with a file system and (in comparison) extremely difficult with a database. My response is that, given the additional integrity a SQL database provides, it would be very unlikely for a single document to become corrupted; the greater risk is the entire image database becoming corrupted.
I think we can both agree that, no matter what approach is taken, backups and the backup strategy are vital to any CMS/DMS system. These systems become a hugely valuable resource to a company, and the loss or even partial loss of any of the data contained within them can be devastating.
I am very happy, when time permits you, to supply you with any information you would like on iiArc. The only area I would have reservations on is the way that we implement encryption of the uploaded documents/images, but otherwise I would thoroughly enjoy defending our approach.
As you mentioned, SaaS and SharePoint will have a huge impact on the CMS/DMS market in the coming years. It could easily be argued that SharePoint has already changed the CMS/DMS landscape massively, and I believe that some of the current bigger players within the SME market will need to change their sales models and product sets to meet the more demanding and much better informed clients that now exist, or run the risk of losing market share and potentially disappearing altogether.
Kindest Regards
Chris Morgan
Managing Director
Data Liberation Ltd
10 months ago
With regards to the shifting marketplace, I would whole-heartedly agree. Microsoft, if not by education alone, has shifted the landscape already. I look for the future of CMS/DMS to have many consolidations and many closures... That being said, I would venture to say you are positioning your company very smartly if trends continue.
9 months ago
The big one lies in the fact that Microsoft recommends keeping Document Libraries at 2000 objects or fewer for performance reasons. Some serious planning needs to take place, especially if you are using it as a repository for scanned documents.
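As a planning aid, here is a minimal Python sketch that takes the 2000-object figure above at face value and splits a set of scanned documents into batches that stay under it. The function name and file names are hypothetical, and nothing here calls SharePoint itself.

```python
def plan_libraries(document_names, max_items=2000):
    """Split documents into consecutive batches of at most `max_items` each."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    return [
        document_names[start:start + max_items]
        for start in range(0, len(document_names), max_items)
    ]


# Made-up example: 4500 scanned pages need three libraries (2000, 2000, 500).
scans = [f"scan{n:05d}.tif" for n in range(4500)]
batches = plan_libraries(scans)

for number, batch in enumerate(batches, start=1):
    print(f"Library {number}: {len(batch)} documents")
```

In practice the split would more likely follow a business key such as year, client, or document type, so that each container stays small and still means something to the people searching it.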
6 months ago
http://technet.microsoft.com/en-us/library/cc26...
I have experience deploying SharePoint to very large corporate law firms that have over 10,000,000 documents and SharePoint performs fine. The Library of Congress also uses SharePoint. SharePoint scales very well.
6 months ago
I have 2 points of contention, which I hope you can help me clarify:
1) I discourage the use of folder structures, as this reintroduces the same mess a typical file server might, by obscuring needed files unnecessarily. The object of an EDM is to surface information for easy findability at all levels, by running various queries to return expected search results.
How is it that folder structures within SP aid in "findability"? This would seem to limit it, in my humble opinion, especially when the folder structure becomes extremely deep.
2) When dealing with such large numbers of images, how can returned results be faster in SP, since images are stored as BLOBs in the DB, vs. traditional EDM solutions that segregate metadata from images (and even OCR tables)?
PS - I hear MS is going to move away from this in 2009?
MSMatt, I would love to hear your opinions of overcoming these challenges as you obviously have some experience in enterprise SP deployments vs what I have worked with in smaller deployments (primarily WSS).
6 months ago
Can anyone confirm?
Thanks.
T.C.
6 months ago
What I am seeing over the past 2 years has been a ramp up of rogue, departmental installations of SP, and now the enterprise is stepping in to determine if/how they can make SP work across the enterprise.
With over $1 billion in MS licensing sold in this last year, this would indicate MOSS is being actively deployed. Also, keep in mind that this number does not include WSS deployments (as WSS is free) or ISV professional services to customize both WSS and MOSS installations.
As I said, though - if you have data to counter these trends I'm seeing, by all means I would love to post here to give a more accurate picture of what might be happening.
Thank you very much for swinging by... I love hearing from readers!
1 week ago
In the conversations I've been having, everyone is scrambling to partner with Microsoft, and it is rumored that in their next release of SharePoint they are leveraging some options like keeping the documents and metadata separate. Given development items like this, Microsoft has always been happy to sell you their solution or allow you to use their solution as the portal and have an EDM vendor simply skin their application for them.
Either way - they win!