Exporting site content from a SharePoint Content Database for recovery purposes.

[Important News!]

binarywaste - profile picture - 1000x1000

I’ve kicked off a new project/band where I’m creating music with a heavier techno/metal sound called binarywaste.  You should check it out by clicking right here!

My good friend Todd Klindt pointed me to this posting written by Mark Jen which showed some sample code to export site content from a SharePoint Content Database directly.

!!! DISCLAIMER !!!!

Everyone who knows me, knows that I am TOTALLY AGAINST doing direct database calls to the SharePoint databases UNLESS it is for complete disaster recovery, or complete offline reporting.   This is one of the reasons DeliverPoint will NEVER have any direct SharePoint database calls in it.

For additional information, see the following Microsoft articles:

“Support for changes to the databases that are used by Office Server products and by Windows SharePoint Services”

http://support.microsoft.com/kb/841057/en-us

“SharePoint Database Access”

http://msdn.microsoft.com/en-us/library/bb861829.aspx

Do NOT run this code against Live SharePoint databases.

Ok, now with that out the way…The problem with Mark’s original code, is that it was written for V2 (WSS 2.0/SPS2003), not V3 (WSS 3.0/MOSS 2007).  Also, his query is targeted at specific files (such as .doc/.ppt/.etc).

With a couple of slight changes, I updated the query and code to work with a V3 content database, and also just have it export everything.

Since his code was also just a snippit, I’m placing an updated version  with complete source here.

// BEGIN SPDBEX.CS

using System;
using System.Collections.Generic;
using System.Text;
using System.Data;
using System.Data.SqlClient;
using System.IO;

namespace spdbex
{
    class Program
    {
        static void Main(string[] args)
        {
            // replace this string with your 

            // Sharepoint content DB connection string
            string DBConnString =
             "Server=DATABASESERVER;" +
             "Database=CONTENTDBNAME;Trusted_Connection=True;";

            // create a DB connection
            SqlConnection con = new SqlConnection(DBConnString);
            con.Open();

            // the query to grab all the files.
            SqlCommand com = con.CreateCommand();
            com.CommandText = "SELECT ad.SiteId, ad.Id, ad.DirName," +
                " ad.LeafName, ads.Content" +
                " FROM AllDocs ad, AllDocStreams ads" +
                " WHERE ad.SiteId = ads.SiteId" +
                " AND ad.Id = ads.Id" +
                " AND ads.Content IS NOT NULL" +
                " Order by DirName";

            // execute query
            SqlDataReader reader = com.ExecuteReader();

            while (reader.Read())
            {
                // grab the file’s directory and name
                string DirName = (string)reader["DirName"];
                string LeafName = (string)reader["LeafName"];

                // create directory for the file if it doesn’t yet exist
                if (!Directory.Exists(DirName))
                {
                    Directory.CreateDirectory(DirName);
                    Console.WriteLine("Creating directory: " + DirName);
                }

                // create a filestream to spit out the file
                FileStream fs = new FileStream(DirName + "/" + LeafName,
                    FileMode.Create, FileAccess.Write);
                BinaryWriter writer = new BinaryWriter(fs);

                // depending on the speed of your network,
                // you may want to change the buffer size (it’s in bytes)
                int bufferSize = 1000000;
                long startIndex = 0;
                long retval = 0;
                byte[] outByte = new byte[bufferSize];

                // grab the file out of the db one chunk
                // (of size bufferSize) at a time
                do
                {
                    retval = reader.GetBytes(4, startIndex, outByte, 0,
                        bufferSize);
                    startIndex += bufferSize;

                    writer.Write(outByte, 0, (int)retval);
                    writer.Flush();
                } while (retval == bufferSize);

                // finish writing the file
                writer.Close();
                fs.Close();

                Console.WriteLine("Finished writing file: " + LeafName);
            }

            // close the DB connection and whatnots
            reader.Close();
            con.Close();
        }
    }
}

// END SPDBEX.CS

Be sure to change the DATABASESERVER and CONTENTDBNAME for the connection string, then simply execute the C# compiler on your server as follows:

%WINDIR%\Microsoft.NET\Framework\v2.0.50727\csc /target:exe /out:spdbex.exe spdbex.cs

The code in Marks’ original form, does not allow you to specify any target directory etc, and just begins exporting in the CURRENT DIRECTORY you’re in for any root site, so be sure to run this from a directory/folder you create to hold the data from the root.

I’ll ping Mark and see if he’ll allow me to take that code and polish it up with other options (Such as site selection, specific folder targeting etc) if he doesn’t want to update it.

HTH

– Keith

69 Replies to “Exporting site content from a SharePoint Content Database for recovery purposes.”

  1. Hi,

    Thanks for this code, it’s been really useful. The only problem I have at the moment is we have quite a few folder paths that are longer than 248 characters. Do you know if there is a workaround to this?

    Thanks

  2. The code would need to be adjusted to create a shorter folder path on the file system for each folder found, then create an appropriate reference map that would then be exported (That is if you were still exporting everything in the content database)
    Optionally, you could modifiy the Query to and an AND clause to point to a specific Site GUID for the site in question.

    Just thoughts at this moment. I’d like to update the code to handle those scenarios, but it’s not something I have time for right at this moment. I’ll make a note of it though.

  3. Hi,

    Thanks for that. I have one other question. Currently, my SQL instance for the Sharepoint databases is only accessible via Named Pipes (I just can’t get it to work using TCP/IP).

    How would I adapt your code to allow me to either used named pipes or a DSN on the Sharepoint server?

    (Unfortunately I’m a complete C# novice)

  4. My god… you have just saved my life. This is the only way I was able to get the content after upgrading from WSS3 to MOSS 2007 and experiencing complete meltdown. You are a god among men- a gentleman, a saint!

  5. I crashed a sharepoint server during a domain migration but was able to retrieve the files from a DB backup file thanks to your nice script and some tweaking (increased memory and updated query to work by subfolder).

    Great work !!!
    I will mention you in my next blog article.

  6. I’m a real newbie at this stuff, I’m trying to learn….but when I run this code i get the following error.
    ————-
    Unhandled Exception: System.Data.SqlClient.SqlException: An error has occurred w
    hile establishing a connection to the server. When connecting to SQL Server 200
    5, this failure may be caused by the fact that under the default settings SQL Se
    rver does not allow remote connections. (provider: Named Pipes Provider, error:
    40 – Could not open a connection to SQL Server)
    ————–

    The thing is that sql server is set up for remote connections.

    Any ideas???

  7. Excellent programming. I must admit I was skeptic about this when I was struggling to get it compiled with pipes. So for the others out there that might have had problems with the pipes like I did… My Scenario..

    Server 2008 SP2
    SQL 2005 Express
    Windows Internal Database (WSS 3.0)
    SQL Server Management Studio Express(SMSE)

    Open Sharepoint Central Admin. > Application Management > Content Databases.

    Click on the Database. Write down its full name or at least enough for you to remember it. (beginning and ending 3 characters of the hash if you left default)
    When the database settings page opens, change the status to Offline and check the “Remove Content Database.” Click Ok.

    Next you need to attach the database to your SQL Express which has prefered naming scheme for this code.

    In SQL SMSE;

    Connect to \\.\pipes\MICROSOFT##SSEE\sql\query with the SQL Connection Mgr.
    Detach the database in question.

    Disconnect from \\.\pipes…
    Connect to
    Server\SQLEXPRESS (assuming you left default)
    Right click on Databases and select Attach. Attach the database from
    C:\Windows\SYSMSI\SSEE…
    \MSSQL.2005\MSSQL\Data

    Continue to your spdbex.cs file from these wonderful lines of code. Thank you Keith for sharing.

    Update lines 20/21.
    Server=Server\SQLEXPRESS
    Database=WSS_Content_…

    Compile as instructed.

    You will have to navigate your command line to that directory. However the output is optional.

    My Command lines:
    CD %WINDIR%\Microsoft.NET\Framework\v2.0.50727
    csc /target:exe /out:C:\SPDBExport\SharepointSiteNameExport.exe spdbex.cs

    I also updated documentation to use this code should I discontinue employment with this company for the next person.

    Thanks again Keith.

    Best Regards,
    Steve K.

  8. Having some trouble. I’ve got the thing compiled and when I run it I get the following error from the debugger after the application crashes:

    ArgumentException was Unhandled. Path cannot be the empty string or all whitespace.

    Any help?

  9. Hello.
    I compiled and executed your code. However, I am having some trouble with running the ASPX pages. From where should the pages be accessed to view the actual content? I am getting many errors (the latest is a 404).
    Thanks

      1. Hello Richie,

        I need to access an entire sharepoint site’s content in order to build a new PHP/MySQL site. I thought your code will help me extract that content. Am I mistaken? If you know of a tool that can get the pages content that would be a life saver. Thanks

  10. Hello,

    I was happy to come across this post, because I need to extract all data from a MOSS 2007 content database.

    I followed all the steps stated in this post and changde the server- and database name according to our situation.
    When I run the exe I get the following error-message:
    E:\111>spdbex.exe
    Unhandled Exception: System.ArgumentException: Path cannot be the empty string or all whitespace.
    at System.IO.Directory.CreateDirectory(String path, DirectorySecurity directorySecurity)
    at spdbex.Program.Main(String[] args)

    What could I have forgotten or done wrong? can you enlighten me?

    1. hello, sorry to respond only now.
      I run it on a local drive. To me it seems as if the content is about to be extracted, but cannot be placed in a directory (maybe because it is from a root sourcen in Sharepoint?).

      I do appreciate you help, because I (still) need to get all the dat out of the content database.

      Kind regards, Wim

  11. Hello Keith,

    the problem is solved. There were indeed empty fields in de Dirname column and therefore the script gave the correct error message and couldn’t create a folder.

    I did the following to investigate the error message:
    From your script I copied the SQL statements and adapted them so I could run it in a query window.
    The output showed that in the first five lines the field Dirname contained no data.
    I jotted down the ID’s of these lines and added five lines to the SQL query:
    AND ad.Id ‘B25E2773-456C-4A29-974A-029E8DF7CCF2’
    AND ad.Id ‘9CC53E20-171E-445B-A33B-3B41E6C3A764′
    AND ad.Id ’07AA120D-B316-4A08-BF0B-59D786056E26’
    AND ad.Id ‘8FF0D288-9D2B-493D-8584-7755DB55A1FD’
    AND ad.Id ‘2B32C357-4516-4EC9-9159-A83F0240E6A2’

    I ran the query again and the five lines were left out of the output.

    Then I edited your script and added the adapted SQL statement:
    // the query to grab all the files.
    SqlCommand com = con.CreateCommand();
    com.CommandText = “SELECT ad.SiteId, ad.Id, ad.DirName,” +
    ” ad.LeafName, ads.Content” +
    ” FROM AllDocs ad, AllDocStreams ads” +
    ” WHERE ad.SiteId = ads.SiteId” +
    ” AND ad.Id = ads.Id” +
    ” AND ads.Content IS NOT NULL” +
    ” AND ad.Id ‘B25E2773-456C-4A29-974A-029E8DF7CCF2’ ” +
    ” AND ad.Id ‘9CC53E20-171E-445B-A33B-3B41E6C3A764′ ” +
    ” AND ad.Id ’07AA120D-B316-4A08-BF0B-59D786056E26’ ” +
    ” AND ad.Id ‘8FF0D288-9D2B-493D-8584-7755DB55A1FD’ ” +
    ” AND ad.Id ‘2B32C357-4516-4EC9-9159-A83F0240E6A2’ ” +
    ” Order by DirName”;

    After compiling the script I ran it and all the data was wonderfully extracted and put in a file structure.

    Many thanks for your help. I hope my findings will be of help to others.

    Kind regards,
    Wim.

  12. After posting my findings, the text is not rendered correctly. The statements I added had the not equal to (”) sign in them.

    ” AND ad.Id ” ‘B25E2773-456C-4A29-974A-029E8DF7CCF2’ ” +

    etc.

  13. Hi Keith
    great script, thanks…is there a chance to update the script for “site/subsite selection” or “site/subsite black/whitelisting”? 🙂
    we have a DB >30GB and the script quits after ~5GB with “Shared Memory-Provider, error: 0”
    Intressting: if i add more memory to the vm the scripts quits later
    4GB Memory: script quits after ~5GB
    8GB Memory: script quits after ~10GB
    don’t have have any vm with >32GB Memory 😦

  14. I have got 1 suggestion for your website. It appears like there are a number of cascading stylesheet problems when opening a number of web pages in google chrome and firefox. It is operating alright in internet explorer. Probably you can double check this.

  15. Wonderful Script Keith! This is very helpful for our audits. However, we have encountered a situation where this works on a small database but not a large database of 13GB. We receive the following error:

    Unhandled Exception: System.Data.SqlClient.SqlException: Timeout expired. The t
    imeout period elapsed prior to completion of the operation or the server is not
    responding.
    at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolea
    n breakConnection)
    at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObj
    ect stateObj)
    at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cm
    dHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, Tds
    ParserStateObject stateObj)
    at System.Data.SqlClient.SqlDataReader.ConsumeMetaData()
    at System.Data.SqlClient.SqlDataReader.get_MetaData()
    at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, Run
    Behavior runBehavior, String resetOptionsString)
    at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBe
    havior, RunBehavior runBehavior, Boolean returnStream, Boolean async)
    at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehav
    ior, RunBehavior runBehavior, Boolean returnStream, String method, DbAsyncResult
    result)
    at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehav
    ior, RunBehavior runBehavior, Boolean returnStream, String method)
    at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, S
    tring method)
    at System.Data.SqlClient.SqlCommand.ExecuteReader()
    at spdbex.Program.Main(String[] args)

    We tried adding the SIDS as well as changing the ASPnet config file to enable exceptions and it still fails on a large database. Any assistance would be greatly appreciated.

      1. I got same error as Rose. As coding is nightmare to me, could you add this timeout thingy to the code?

  16. Here is the script adjusted for:
    * timeout
    * Empty path
    * File name and path too long error
    * Added more console output

    // BEGIN SPDBEX.CS

    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Data;
    using System.Data.SqlClient;
    using System.IO;

    namespace spdbex
    {
    class Program
    {
    static void Main(string[] args)
    {
    // replace this string with your

    // Sharepoint content DB connection string
    string DBConnString =
    “Server=VMM2;” +
    “Database=WSS_Sharepoint_Content_80_2010;Trusted_Connection=True;Connection Timeout=30”;

    // create a DB connection
    SqlConnection con = new SqlConnection(DBConnString);
    con.Open();
    Console.WriteLine(“Opened DB Connection”);
    Console.WriteLine(“Connection Timeout: ” + con.ConnectionTimeout);

    // the query to grab all the files.
    SqlCommand com = con.CreateCommand();
    com.CommandText = “SELECT ad.SiteId, ad.Id, ad.DirName,” +
    ” ad.LeafName, ads.Content” +
    ” FROM AllDocs ad, AllDocStreams ads” +
    ” WHERE ad.SiteId = ads.SiteId” +
    ” AND ad.Id = ads.Id” +
    ” AND ads.Content IS NOT NULL” +
    ” AND ad.DirName IS NOT NULL” +
    ” Order by DirName”;

    Console.WriteLine(“Query:” + com.CommandText);

    // execute query
    SqlDataReader reader = com.ExecuteReader();

    Console.WriteLine(“Executed the query to grab all the files”);

    while (reader.Read())
    {
    // grab the file’s directory and name
    string DirName = (string)reader[“DirName”];
    string LeafName = (string)reader[“LeafName”];

    if (DirName != “” && (DirName.Length + LeafName.Length < 240) )
    {
    Console.WriteLine("DirName.Length: " + DirName.Length + "LeafName.Length: " + LeafName.Length);

    // create directory for the file if it doesn’t yet exist
    if (!Directory.Exists(DirName))
    {
    Directory.CreateDirectory(DirName);
    Console.WriteLine("Creating directory: " + DirName);
    }

    // create a filestream to spit out the file
    FileStream fs = new FileStream(DirName + "/" + LeafName,
    FileMode.Create, FileAccess.Write);
    BinaryWriter writer = new BinaryWriter(fs);

    // depending on the speed of your network,
    // you may want to change the buffer size (it’s in bytes)
    int bufferSize = 1000000;
    long startIndex = 0;
    long retval = 0;
    byte[] outByte = new byte[bufferSize];

    // grab the file out of the db one chunk
    // (of size bufferSize) at a time
    do
    {
    retval = reader.GetBytes(4, startIndex, outByte, 0,
    bufferSize);
    startIndex += bufferSize;

    writer.Write(outByte, 0, (int)retval);
    writer.Flush();
    } while (retval == bufferSize);

    // finish writing the file
    writer.Close();
    fs.Close();

    Console.WriteLine("Finished writing file: " + LeafName);
    }
    }

    // close the DB connection and whatnots
    reader.Close();
    con.Close();
    }
    }
    }

    // END SPDBEX.CS

    1. Ah, another thing I should point out is that the Too long file name code is brute. It just skips over long files. A more proper and simple handling would be to cut short the names and to stick something that will make it unique at the end like a numerator.

  17. Hi, thanks for this really great code. I was able to run it and recover all documents but the site had a Wiki Page Library containing a Home.aspx page. Is there any way to recover it using this method?

  18. Hi,

    First of all, thanks for sharing this!

    I was asked to make some changes to this code as we needed something very similar (just needed to filter at a specific subsite). I decided to rewrite this into PowerShell as it is easier to change later on without having the need to recompile (and also running .exe files on a farm always seems to be risky — even though you know the code)

    If interested, you can find the post here (linked back to this post):
    https://jochimsergoyne.wordpress.com/2016/07/26/export-files-directly-from-db/

  19. Keith, I follow all your instructions. I am getting spdbex.exe that is blank, when I open it. Am I missing something.
    Here what I replace:
    // Sharepoint content DB connection string
    string DBConnString =
    “Server=ROPH-INTRANET\\MICROSOFT##SSEE;” +
    “Database=WSS_Content;Trusted_Connection=True;”;

      1. Keith, I run the spdbex.exe file. I am getting an error message. An unhandled exception (‘System.Data.SqlClient.SqlException’) occurred in spdbex.exe [8600].
        Possible Debuggers:
        New instance of Visual Studio 2005.
        We are on WSS30 Sharepoint 2003 server.

  20. This is error message I am getting.
    An unhandled exception (‘System.Data.SqlClient.SqlException’) occurred in spdbex.exe [7656].
    Possible Debuggers:
    New instance of Visual Studio 2005

    I am using 2003 Server, there no powershell on this server.

  21. Saviram Lior updated worked. Thanks Keith and Saviram.
    Keith, one more question, how can I pull the calendar date?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: