[Important News!]
I’ve kicked off a new project/band where I’m creating music with a heavier techno/metal sound called binarywaste. You should check it out by clicking right here!
My good friend Todd Klindt pointed me to this posting written by Mark Jen which showed some sample code to export site content from a SharePoint Content Database directly.
!!! DISCLAIMER !!!!
Everyone who knows me, knows that I am TOTALLY AGAINST doing direct database calls to the SharePoint databases UNLESS it is for complete disaster recovery, or complete offline reporting. This is one of the reasons DeliverPoint will NEVER have any direct SharePoint database calls in it.
For additional information, see the following Microsoft articles:
“Support for changes to the databases that are used by Office Server products and by Windows SharePoint Services”
http://support.microsoft.com/kb/841057/en-us
“SharePoint Database Access”
http://msdn.microsoft.com/en-us/library/bb861829.aspx
Do NOT run this code against Live SharePoint databases.
Ok, now with that out the way…The problem with Mark’s original code, is that it was written for V2 (WSS 2.0/SPS2003), not V3 (WSS 3.0/MOSS 2007). Also, his query is targeted at specific files (such as .doc/.ppt/.etc).
With a couple of slight changes, I updated the query and code to work with a V3 content database, and also just have it export everything.
Since his code was also just a snippit, I’m placing an updated version with complete source here.
// BEGIN SPDBEX.CS using System; using System.Collections.Generic; using System.Text; using System.Data; using System.Data.SqlClient; using System.IO; namespace spdbex { class Program { static void Main(string[] args) { // replace this string with your // Sharepoint content DB connection string string DBConnString = "Server=DATABASESERVER;" + "Database=CONTENTDBNAME;Trusted_Connection=True;"; // create a DB connection SqlConnection con = new SqlConnection(DBConnString); con.Open(); // the query to grab all the files. SqlCommand com = con.CreateCommand(); com.CommandText = "SELECT ad.SiteId, ad.Id, ad.DirName," + " ad.LeafName, ads.Content" + " FROM AllDocs ad, AllDocStreams ads" + " WHERE ad.SiteId = ads.SiteId" + " AND ad.Id = ads.Id" + " AND ads.Content IS NOT NULL" + " Order by DirName"; // execute query SqlDataReader reader = com.ExecuteReader(); while (reader.Read()) { // grab the file’s directory and name string DirName = (string)reader["DirName"]; string LeafName = (string)reader["LeafName"]; // create directory for the file if it doesn’t yet exist if (!Directory.Exists(DirName)) { Directory.CreateDirectory(DirName); Console.WriteLine("Creating directory: " + DirName); } // create a filestream to spit out the file FileStream fs = new FileStream(DirName + "/" + LeafName, FileMode.Create, FileAccess.Write); BinaryWriter writer = new BinaryWriter(fs); // depending on the speed of your network, // you may want to change the buffer size (it’s in bytes) int bufferSize = 1000000; long startIndex = 0; long retval = 0; byte[] outByte = new byte[bufferSize]; // grab the file out of the db one chunk // (of size bufferSize) at a time do { retval = reader.GetBytes(4, startIndex, outByte, 0, bufferSize); startIndex += bufferSize; writer.Write(outByte, 0, (int)retval); writer.Flush(); } while (retval == bufferSize); // finish writing the file writer.Close(); fs.Close(); Console.WriteLine("Finished writing file: " + LeafName); } // close the DB connection and whatnots reader.Close(); con.Close(); } } } // END SPDBEX.CS
Be sure to change the DATABASESERVER and CONTENTDBNAME for the connection string, then simply execute the C# compiler on your server as follows:
%WINDIR%\Microsoft.NET\Framework\v2.0.50727\csc /target:exe /out:spdbex.exe spdbex.cs
The code in Marks’ original form, does not allow you to specify any target directory etc, and just begins exporting in the CURRENT DIRECTORY you’re in for any root site, so be sure to run this from a directory/folder you create to hold the data from the root.
I’ll ping Mark and see if he’ll allow me to take that code and polish it up with other options (Such as site selection, specific folder targeting etc) if he doesn’t want to update it.
HTH
– Keith
Thanks Keith. This works great.
tk
Keith
Loving it buddy !
Thanks for the excellent code – it’s a real life saver!
Hi,
Thanks for this code, it’s been really useful. The only problem I have at the moment is we have quite a few folder paths that are longer than 248 characters. Do you know if there is a workaround to this?
Thanks
The code would need to be adjusted to create a shorter folder path on the file system for each folder found, then create an appropriate reference map that would then be exported (That is if you were still exporting everything in the content database)
Optionally, you could modifiy the Query to and an AND clause to point to a specific Site GUID for the site in question.
Just thoughts at this moment. I’d like to update the code to handle those scenarios, but it’s not something I have time for right at this moment. I’ll make a note of it though.
Hi,
Thanks for that. I have one other question. Currently, my SQL instance for the Sharepoint databases is only accessible via Named Pipes (I just can’t get it to work using TCP/IP).
How would I adapt your code to allow me to either used named pipes or a DSN on the Sharepoint server?
(Unfortunately I’m a complete C# novice)
Hi,
I managed to find some code on the web to allow me to use a DSN. Thanks once again.
My god… you have just saved my life. This is the only way I was able to get the content after upgrading from WSS3 to MOSS 2007 and experiencing complete meltdown. You are a god among men- a gentleman, a saint!
I deleted a site from the site settings…but then hit cancel. From what I can tell the database is intact…can I rebuild my site with the code above
I crashed a sharepoint server during a domain migration but was able to retrieve the files from a DB backup file thanks to your nice script and some tweaking (increased memory and updated query to work by subfolder).
Great work !!!
I will mention you in my next blog article.
Thanks Jehan!
Hi,
Just want to make a note for a sharepoint document recovery software
http://www.kerneldatarecovery.com/sharepoint-recovery.html
this one proves helpful to restore and recover from MDF db’s with rae repair and Live repair and recovery from the database. Might prove helpful when you face sharepoint database corruptions.
An error occurred when I execute spdbex.exe.
“ArgumentException”
How can I fix this error?
I need a bit more context here. the code snippet doesn’t take any arguments.
Do you get a stack trace after the error?
Thanks for this post!
I’m a real newbie at this stuff, I’m trying to learn….but when I run this code i get the following error.
————-
Unhandled Exception: System.Data.SqlClient.SqlException: An error has occurred w
hile establishing a connection to the server. When connecting to SQL Server 200
5, this failure may be caused by the fact that under the default settings SQL Se
rver does not allow remote connections. (provider: Named Pipes Provider, error:
40 – Could not open a connection to SQL Server)
————–
The thing is that sql server is set up for remote connections.
Any ideas???
Steven, apologies for the extremely late reply.
Is this the developer version of SQL Server? If so, you need to run the surface area configuration app.
Thanks for sharing this information with us …its really helpful…
Excellent programming. I must admit I was skeptic about this when I was struggling to get it compiled with pipes. So for the others out there that might have had problems with the pipes like I did… My Scenario..
Server 2008 SP2
SQL 2005 Express
Windows Internal Database (WSS 3.0)
SQL Server Management Studio Express(SMSE)
Open Sharepoint Central Admin. > Application Management > Content Databases.
Click on the Database. Write down its full name or at least enough for you to remember it. (beginning and ending 3 characters of the hash if you left default)
When the database settings page opens, change the status to Offline and check the “Remove Content Database.” Click Ok.
Next you need to attach the database to your SQL Express which has prefered naming scheme for this code.
In SQL SMSE;
Connect to \\.\pipes\MICROSOFT##SSEE\sql\query with the SQL Connection Mgr.
Detach the database in question.
Disconnect from \\.\pipes…
Connect to
Server\SQLEXPRESS (assuming you left default)
Right click on Databases and select Attach. Attach the database from
C:\Windows\SYSMSI\SSEE…
\MSSQL.2005\MSSQL\Data
Continue to your spdbex.cs file from these wonderful lines of code. Thank you Keith for sharing.
Update lines 20/21.
Server=Server\SQLEXPRESS
Database=WSS_Content_…
Compile as instructed.
You will have to navigate your command line to that directory. However the output is optional.
My Command lines:
CD %WINDIR%\Microsoft.NET\Framework\v2.0.50727
csc /target:exe /out:C:\SPDBExport\SharepointSiteNameExport.exe spdbex.cs
I also updated documentation to use this code should I discontinue employment with this company for the next person.
Thanks again Keith.
Best Regards,
Steve K.
Hello Krichie,
I want to introduce about a tool that recovers corrupted documents, inaccessible sharepoint site content as well as corrupted MDF files. You can check this tool from here:
http://www.sharepoint-server-recovery.com/
Having some trouble. I’ve got the thing compiled and when I run it I get the following error from the debugger after the application crashes:
ArgumentException was Unhandled. Path cannot be the empty string or all whitespace.
Any help?
Can you tell me the exact command line options you are using? Also can you show me the stacktrace?
Hello.
I compiled and executed your code. However, I am having some trouble with running the ASPX pages. From where should the pages be accessed to view the actual content? I am getting many errors (the latest is a 404).
Thanks
Um,. Samah. There are no pages in this application, this simply is a console app to extract files from a database.
Hello Richie,
I need to access an entire sharepoint site’s content in order to build a new PHP/MySQL site. I thought your code will help me extract that content. Am I mistaken? If you know of a tool that can get the pages content that would be a life saver. Thanks
Hello,
I was happy to come across this post, because I need to extract all data from a MOSS 2007 content database.
I followed all the steps stated in this post and changde the server- and database name according to our situation.
When I run the exe I get the following error-message:
E:\111>spdbex.exe
Unhandled Exception: System.ArgumentException: Path cannot be the empty string or all whitespace.
at System.IO.Directory.CreateDirectory(String path, DirectorySecurity directorySecurity)
at spdbex.Program.Main(String[] args)
What could I have forgotten or done wrong? can you enlighten me?
Wim, is the directory you’re running it in a local directory? Or a network share? make sure you run it on a local drive.
hello, sorry to respond only now.
I run it on a local drive. To me it seems as if the content is about to be extracted, but cannot be placed in a directory (maybe because it is from a root sourcen in Sharepoint?).
I do appreciate you help, because I (still) need to get all the dat out of the content database.
Kind regards, Wim
Hello Keith,
the problem is solved. There were indeed empty fields in de Dirname column and therefore the script gave the correct error message and couldn’t create a folder.
I did the following to investigate the error message:
From your script I copied the SQL statements and adapted them so I could run it in a query window.
The output showed that in the first five lines the field Dirname contained no data.
I jotted down the ID’s of these lines and added five lines to the SQL query:
AND ad.Id ‘B25E2773-456C-4A29-974A-029E8DF7CCF2’
AND ad.Id ‘9CC53E20-171E-445B-A33B-3B41E6C3A764′
AND ad.Id ’07AA120D-B316-4A08-BF0B-59D786056E26’
AND ad.Id ‘8FF0D288-9D2B-493D-8584-7755DB55A1FD’
AND ad.Id ‘2B32C357-4516-4EC9-9159-A83F0240E6A2’
I ran the query again and the five lines were left out of the output.
Then I edited your script and added the adapted SQL statement:
// the query to grab all the files.
SqlCommand com = con.CreateCommand();
com.CommandText = “SELECT ad.SiteId, ad.Id, ad.DirName,” +
” ad.LeafName, ads.Content” +
” FROM AllDocs ad, AllDocStreams ads” +
” WHERE ad.SiteId = ads.SiteId” +
” AND ad.Id = ads.Id” +
” AND ads.Content IS NOT NULL” +
” AND ad.Id ‘B25E2773-456C-4A29-974A-029E8DF7CCF2’ ” +
” AND ad.Id ‘9CC53E20-171E-445B-A33B-3B41E6C3A764′ ” +
” AND ad.Id ’07AA120D-B316-4A08-BF0B-59D786056E26’ ” +
” AND ad.Id ‘8FF0D288-9D2B-493D-8584-7755DB55A1FD’ ” +
” AND ad.Id ‘2B32C357-4516-4EC9-9159-A83F0240E6A2’ ” +
” Order by DirName”;
After compiling the script I ran it and all the data was wonderfully extracted and put in a file structure.
Many thanks for your help. I hope my findings will be of help to others.
Kind regards,
Wim.
After posting my findings, the text is not rendered correctly. The statements I added had the not equal to (”) sign in them.
” AND ad.Id ” ‘B25E2773-456C-4A29-974A-029E8DF7CCF2’ ” +
etc.
Hello,
Great post! It helped me alot.
Is there a possiblity to also export .docx and .xlsx files?
I should handle that as is Dennis
You saved our bacon with this. Thank you so much!
Awesome, this is exactly what I needed! Thank you!!
thx much for showing the compile command. worked like a charm – saved me from a corrupt config db. whew!
You have made my day! Thanks 🙂
Hi Keith
great script, thanks…is there a chance to update the script for “site/subsite selection” or “site/subsite black/whitelisting”? 🙂
we have a DB >30GB and the script quits after ~5GB with “Shared Memory-Provider, error: 0”
Intressting: if i add more memory to the vm the scripts quits later
4GB Memory: script quits after ~5GB
8GB Memory: script quits after ~10GB
don’t have have any vm with >32GB Memory 😦
Chris, I would recommend just adding to the WHERE clause in the SQL script to include the GUID of your site/web that you want to narrow down on.
Keith, works perfect with GUID…thanks buddy 😉
I have got 1 suggestion for your website. It appears like there are a number of cascading stylesheet problems when opening a number of web pages in google chrome and firefox. It is operating alright in internet explorer. Probably you can double check this.
Make your suggestion to the folks at WordPress, not me.
Thank you…. this has been a life saver after the update killed our sharepoint environment.
This is awesome. Thanks Keith!
Wonderful Script Keith! This is very helpful for our audits. However, we have encountered a situation where this works on a small database but not a large database of 13GB. We receive the following error:
Unhandled Exception: System.Data.SqlClient.SqlException: Timeout expired. The t
imeout period elapsed prior to completion of the operation or the server is not
responding.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolea
n breakConnection)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObj
ect stateObj)
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cm
dHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, Tds
ParserStateObject stateObj)
at System.Data.SqlClient.SqlDataReader.ConsumeMetaData()
at System.Data.SqlClient.SqlDataReader.get_MetaData()
at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, Run
Behavior runBehavior, String resetOptionsString)
at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBe
havior, RunBehavior runBehavior, Boolean returnStream, Boolean async)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehav
ior, RunBehavior runBehavior, Boolean returnStream, String method, DbAsyncResult
result)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehav
ior, RunBehavior runBehavior, Boolean returnStream, String method)
at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, S
tring method)
at System.Data.SqlClient.SqlCommand.ExecuteReader()
at spdbex.Program.Main(String[] args)
We tried adding the SIDS as well as changing the ASPnet config file to enable exceptions and it still fails on a large database. Any assistance would be greatly appreciated.
Rose, look at setting the ConnectionTimeout property on the SqlConnection in the code: http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlconnection.connectiontimeout.aspx
I got same error as Rose. As coding is nightmare to me, could you add this timeout thingy to the code?
Here is the script adjusted for:
* timeout
* Empty path
* File name and path too long error
* Added more console output
// BEGIN SPDBEX.CS
using System;
using System.Collections.Generic;
using System.Text;
using System.Data;
using System.Data.SqlClient;
using System.IO;
namespace spdbex
{
class Program
{
static void Main(string[] args)
{
// replace this string with your
// Sharepoint content DB connection string
string DBConnString =
“Server=VMM2;” +
“Database=WSS_Sharepoint_Content_80_2010;Trusted_Connection=True;Connection Timeout=30”;
// create a DB connection
SqlConnection con = new SqlConnection(DBConnString);
con.Open();
Console.WriteLine(“Opened DB Connection”);
Console.WriteLine(“Connection Timeout: ” + con.ConnectionTimeout);
// the query to grab all the files.
SqlCommand com = con.CreateCommand();
com.CommandText = “SELECT ad.SiteId, ad.Id, ad.DirName,” +
” ad.LeafName, ads.Content” +
” FROM AllDocs ad, AllDocStreams ads” +
” WHERE ad.SiteId = ads.SiteId” +
” AND ad.Id = ads.Id” +
” AND ads.Content IS NOT NULL” +
” AND ad.DirName IS NOT NULL” +
” Order by DirName”;
Console.WriteLine(“Query:” + com.CommandText);
// execute query
SqlDataReader reader = com.ExecuteReader();
Console.WriteLine(“Executed the query to grab all the files”);
while (reader.Read())
{
// grab the file’s directory and name
string DirName = (string)reader[“DirName”];
string LeafName = (string)reader[“LeafName”];
if (DirName != “” && (DirName.Length + LeafName.Length < 240) )
{
Console.WriteLine("DirName.Length: " + DirName.Length + "LeafName.Length: " + LeafName.Length);
// create directory for the file if it doesn’t yet exist
if (!Directory.Exists(DirName))
{
Directory.CreateDirectory(DirName);
Console.WriteLine("Creating directory: " + DirName);
}
// create a filestream to spit out the file
FileStream fs = new FileStream(DirName + "/" + LeafName,
FileMode.Create, FileAccess.Write);
BinaryWriter writer = new BinaryWriter(fs);
// depending on the speed of your network,
// you may want to change the buffer size (it’s in bytes)
int bufferSize = 1000000;
long startIndex = 0;
long retval = 0;
byte[] outByte = new byte[bufferSize];
// grab the file out of the db one chunk
// (of size bufferSize) at a time
do
{
retval = reader.GetBytes(4, startIndex, outByte, 0,
bufferSize);
startIndex += bufferSize;
writer.Write(outByte, 0, (int)retval);
writer.Flush();
} while (retval == bufferSize);
// finish writing the file
writer.Close();
fs.Close();
Console.WriteLine("Finished writing file: " + LeafName);
}
}
// close the DB connection and whatnots
reader.Close();
con.Close();
}
}
}
// END SPDBEX.CS
Oops, left in my server and DB names so be sure to replace them
Ah, another thing I should point out is that the Too long file name code is brute. It just skips over long files. A more proper and simple handling would be to cut short the names and to stick something that will make it unique at the end like a numerator.
Keith – Thanks!!!
Thanks a lot – You saved me!
Hi, thanks for this really great code. I was able to run it and recover all documents but the site had a Wiki Page Library containing a Home.aspx page. Is there any way to recover it using this method?
Hi,
First of all, thanks for sharing this!
I was asked to make some changes to this code as we needed something very similar (just needed to filter at a specific subsite). I decided to rewrite this into PowerShell as it is easier to change later on without having the need to recompile (and also running .exe files on a farm always seems to be risky — even though you know the code)
If interested, you can find the post here (linked back to this post):
https://jochimsergoyne.wordpress.com/2016/07/26/export-files-directly-from-db/
of course. Just remember I still don’t recommend running it on a “Live” site. Better to take a backup of the database and place it on a recovery box or something.
Keith, I follow all your instructions. I am getting spdbex.exe that is blank, when I open it. Am I missing something.
Here what I replace:
// Sharepoint content DB connection string
string DBConnString =
“Server=ROPH-INTRANET\\MICROSOFT##SSEE;” +
“Database=WSS_Content;Trusted_Connection=True;”;
By “Blank” what do you mean, that nothing is exported?
Yes. The only thing that appear after I compile was spdbex.exe file.
Also, be sure to check out the comment by Jochim, he converted it to PowerShell so that may help with making slight tweaks to it if you need. See here: https://jochimsergoyne.wordpress.com/2016/07/26/export-files-directly-from-db/
Keith, I run the spdbex.exe file. I am getting an error message. An unhandled exception (‘System.Data.SqlClient.SqlException’) occurred in spdbex.exe [8600].
Possible Debuggers:
New instance of Visual Studio 2005.
We are on WSS30 Sharepoint 2003 server.
Yes. The only file that appear was spdbex.exe file.
Yes. The only file appear was the spdbex.exe file.
This is error message I am getting.
An unhandled exception (‘System.Data.SqlClient.SqlException’) occurred in spdbex.exe [7656].
Possible Debuggers:
New instance of Visual Studio 2005
I am using 2003 Server, there no powershell on this server.
what is the stack that is shown for that? Looks like your connection string may not be right, and it’s throwing the exception because of that.
Saviram Lior updated worked. Thanks Keith and Saviram.
Keith, one more question, how can I pull the calendar date?
Not sure what you mean Anthony, and sorry for the late reply… Just now seeing this. Do you mean pull data based on the modified or created date ?
Thank you Keith for putting together this solution. I put together an app based on your original code, added some of the code from the comments, and then a few more things my team needed. Please check the main github page for some quick instructions – https://github.com/blakeja/Wss3ContentRecovery. Here is a direct link to the zip – https://github.com/blakeja/Wss3ContentRecovery/releases/download/v1/recover.zip.