My good friend Todd Klindt pointed me to this posting written by Mark Jen which showed some sample code to export site content from a SharePoint Content Database directly.
!!! DISCLAIMER !!!!
Everyone who knows me, knows that I am TOTALLY AGAINST doing direct database calls to the SharePoint databases UNLESS it is for complete disaster recovery, or complete offline reporting. This is one of the reasons DeliverPoint will NEVER have any direct SharePoint database calls in it.
For additional information, see the following Microsoft articles:
“Support for changes to the databases that are used by Office Server products and by Windows SharePoint Services”
http://support.microsoft.com/kb/841057/en-us
“SharePoint Database Access”
http://msdn.microsoft.com/en-us/library/bb861829.aspx
Do NOT run this code against Live SharePoint databases.
Ok, now with that out the way…The problem with Mark’s original code, is that it was written for V2 (WSS 2.0/SPS2003), not V3 (WSS 3.0/MOSS 2007). Also, his query is targeted at specific files (such as .doc/.ppt/.etc).
With a couple of slight changes, I updated the query and code to work with a V3 content database, and also just have it export everything.
Since his code was also just a snippit, I’m placing an updated version with complete source here.
// BEGIN SPDBEX.CS
using System;
using System.Collections.Generic;
using System.Text;
using System.Data;
using System.Data.SqlClient;
using System.IO;
namespace spdbex
{
class Program
{
static void Main(string[] args)
{
// replace this string with your
// Sharepoint content DB connection string
string DBConnString =
"Server=DATABASESERVER;" +
"Database=CONTENTDBNAME;Trusted_Connection=True;";
// create a DB connection
SqlConnection con = new SqlConnection(DBConnString);
con.Open();
// the query to grab all the files.
SqlCommand com = con.CreateCommand();
com.CommandText = "SELECT ad.SiteId, ad.Id, ad.DirName," +
" ad.LeafName, ads.Content" +
" FROM AllDocs ad, AllDocStreams ads" +
" WHERE ad.SiteId = ads.SiteId" +
" AND ad.Id = ads.Id" +
" AND ads.Content IS NOT NULL" +
" Order by DirName";
// execute query
SqlDataReader reader = com.ExecuteReader();
while (reader.Read())
{
// grab the file’s directory and name
string DirName = (string)reader["DirName"];
string LeafName = (string)reader["LeafName"];
// create directory for the file if it doesn’t yet exist
if (!Directory.Exists(DirName))
{
Directory.CreateDirectory(DirName);
Console.WriteLine("Creating directory: " + DirName);
}
// create a filestream to spit out the file
FileStream fs = new FileStream(DirName + "/" + LeafName,
FileMode.Create, FileAccess.Write);
BinaryWriter writer = new BinaryWriter(fs);
// depending on the speed of your network,
// you may want to change the buffer size (it’s in bytes)
int bufferSize = 1000000;
long startIndex = 0;
long retval = 0;
byte[] outByte = new byte[bufferSize];
// grab the file out of the db one chunk
// (of size bufferSize) at a time
do
{
retval = reader.GetBytes(4, startIndex, outByte, 0,
bufferSize);
startIndex += bufferSize;
writer.Write(outByte, 0, (int)retval);
writer.Flush();
} while (retval == bufferSize);
// finish writing the file
writer.Close();
fs.Close();
Console.WriteLine("Finished writing file: " + LeafName);
}
// close the DB connection and whatnots
reader.Close();
con.Close();
}
}
}
// END SPDBEX.CS
Be sure to change the DATABASESERVER and CONTENTDBNAME for the connection string, then simply execute the C# compiler on your server as follows:
%WINDIR%\Microsoft.NET\Framework\v2.0.50727\csc /target:exe /out:spdbex.exe spdbex.cs
The code in Marks’ original form, does not allow you to specify any target directory etc, and just begins exporting in the CURRENT DIRECTORY you’re in for any root site, so be sure to run this from a directory/folder you create to hold the data from the root.
I’ll ping Mark and see if he’ll allow me to take that code and polish it up with other options (Such as site selection, specific folder targeting etc) if he doesn’t want to update it.
HTH
- Keith