Locating and reading embedded objects inside an MS Word table

This topic contains 3 replies, has 2 voices, and was last updated by  vgurunathaprasad 8 years, 5 months ago.

    #3530

    vgurunathaprasad
    Participant

    I have a Word document that contains images and embedded Excel workbooks inside table cells. I tried some programs from this site that extract embedded objects, but they extract all the embedded objects from the document as a whole. What I want to do is check each cell and find each embedded object along with its location, that is, which paragraph inside the cell contains it. I need these locations because I have to rebuild the document programmatically exactly as it was given.

    Screenshot of the document: http://i.stack.imgur.com/jrXKQ.png

    Can anyone help me, please? Thanks in advance.

    #3540

    Eric White
    Keymaster

    It is not quite clear to me what you are trying to do, but I can offer a brief explanation.

    Every embedded object is in fact inside a paragraph. You can tell what paragraph it is in by finding the w:p element that is the ancestor of the object.
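
    A minimal sketch of that ancestor lookup, assuming the Open XML SDK (DocumentFormat.OpenXml) is referenced. The names ObjectLocator and Describe and the fallback file name are hypothetical, and the indexes count only direct children, so nested tables and content controls (w:sdt) are not handled.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class ObjectLocator
{
    // Describes where an element sits: zero-based row, cell, and paragraph index.
    // An index of -1 means the element is wrapped in something this sketch
    // does not look through (for example a content control).
    public static string Describe(OpenXmlElement element)
    {
        // Ancestors<T>() walks from the parent upward, so the first hit is the nearest one.
        var paragraph = element.Ancestors<Paragraph>().FirstOrDefault();
        if (paragraph == null) return "not inside a paragraph";

        var cell = paragraph.Ancestors<TableCell>().FirstOrDefault();
        if (cell == null) return "in a paragraph outside any table";

        var row = cell.Ancestors<TableRow>().First();
        var table = row.Ancestors<Table>().First();

        int rowIndex = table.Elements<TableRow>().ToList().IndexOf(row);
        int cellIndex = row.Elements<TableCell>().ToList().IndexOf(cell);
        int paragraphIndex = cell.Elements<Paragraph>().ToList().IndexOf(paragraph);

        return $"row {rowIndex}, cell {cellIndex}, paragraph {paragraphIndex}";
    }

    public static void Main(string[] args)
    {
        // Hypothetical file name; pass a real path as the first argument.
        string fileName = args.Length > 0 ? args[0] : "test1.docx";

        using (var document = WordprocessingDocument.Open(fileName, false))
        {
            var body = document.MainDocumentPart.Document.Body;

            // w:object elements carry OLE embeddings such as spreadsheets.
            foreach (var obj in body.Descendants<EmbeddedObject>())
                Console.WriteLine("embedded object: " + Describe(obj));

            // w:drawing elements carry DrawingML pictures.
            foreach (var drawing in body.Descendants<Drawing>())
                Console.WriteLine("drawing: " + Describe(drawing));
        }
    }
}
```

    Recording the row, cell, and paragraph index for each object gives enough information to put the object back in the same place when the document is rebuilt.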

    Two pieces of information tell you the type of an object. First, look at the markup in the main document part: the markup for an image differs from the markup for an embedded spreadsheet. Second, you can learn more about the embedded object from the relationship type and the content type of the related part. For example, an embedded spreadsheet will have the following markup:

    The markup contains the relationship ID of the part that holds the embedded spreadsheet. Here, the relationship ID is rId5.
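
    To make this concrete, here is a small sketch that uses only LINQ to XML. The XML fragment is a representative example of what Word typically writes for an embedded workbook, assumed here for illustration rather than taken from the post; its attribute values are made up, except r:id='rId5', which mirrors the relationship ID mentioned above. The names OleMarkupSketch and GetOleRelationshipIds are hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class OleMarkupSketch
{
    static readonly XNamespace W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static readonly XNamespace O = "urn:schemas-microsoft-com:office:office";

    // Illustrative fragment only (an assumption, not the markup from the thread).
    public const string SampleParagraph =
@"<w:p xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'
     xmlns:r='http://schemas.openxmlformats.org/officeDocument/2006/relationships'
     xmlns:v='urn:schemas-microsoft-com:vml'
     xmlns:o='urn:schemas-microsoft-com:office:office'>
  <w:r>
    <w:object w:dxaOrig='1531' w:dyaOrig='990'>
      <v:shape id='_x0000_i1025' style='width:76.5pt;height:49.5pt' o:ole=''>
        <v:imagedata r:id='rId4' o:title=''/>
      </v:shape>
      <o:OLEObject Type='Embed' ProgID='Excel.Sheet.12' ShapeID='_x0000_i1025'
                   DrawAspect='Icon' ObjectID='_1234567890' r:id='rId5'/>
    </w:object>
  </w:r>
</w:p>";

    // Returns the r:id of every o:OLEObject found under a w:object in the paragraph.
    public static string[] GetOleRelationshipIds(XElement paragraph)
    {
        return paragraph
            .Descendants(W + "object")
            .Elements(O + "OLEObject")
            .Select(ole => (string)ole.Attribute(R + "id"))
            .Where(id => id != null)
            .ToArray();
    }

    public static void Main()
    {
        var paragraph = XElement.Parse(SampleParagraph);
        foreach (var id in GetOleRelationshipIds(paragraph))
            Console.WriteLine(id); // prints "rId5"
    }
}
```

    In this fragment the v:imagedata element points at the preview picture (rId4), while the o:OLEObject element points at the embedded package itself (rId5), so the query reads only the latter.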

    If we use the Open XML Package Editor PowerTool for Visual Studio, we can see that the relationship type is:

    http://schemas.openxmlformats.org/officeDocument/2006/relationships/package

    When we look at the related part, we can see the content type is:

    application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
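
    The same two values can also be read in code. The following is a rough sketch, assuming the Open XML SDK is referenced; RelationshipInspector, DescribeRelatedParts, and the fallback file name are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using DocumentFormat.OpenXml.Packaging;

public static class RelationshipInspector
{
    // Returns one line per part related to the given container:
    // relationship ID, relationship type, content type, and part URI.
    public static List<string> DescribeRelatedParts(OpenXmlPartContainer container)
    {
        var lines = new List<string>();
        foreach (IdPartPair pair in container.Parts)
        {
            OpenXmlPart part = pair.OpenXmlPart;
            lines.Add($"{pair.RelationshipId} | {part.RelationshipType} | {part.ContentType} | {part.Uri}");
        }
        return lines;
    }

    public static void Main(string[] args)
    {
        // Hypothetical file name; pass a real path as the first argument.
        string fileName = args.Length > 0 ? args[0] : "test1.docx";

        using (var document = WordprocessingDocument.Open(fileName, false))
        {
            foreach (string line in DescribeRelatedParts(document.MainDocumentPart))
                Console.WriteLine(line);
        }
    }
}
```

    For a document like the one discussed here, the line for the embedded spreadsheet should show the package relationship type and the spreadsheet content type quoted above, while a picture part shows a relationship type ending in /image.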

    If you are unfamiliar with relationship and content types, I recommend watching the screen-casts on Open Packaging Conventions. They are screen-casts #11 and #12 in the following series:

    Introduction to Open XML

    I also recommend watching the other screen-casts in that series.

    You would also benefit, I think, from watching the screen-casts on WordprocessingML.

    #3554

    vgurunathaprasad
    Participant

    Thank you so much, sir. I will work on it and report back with the results ASAP.

    #3606

    vgurunathaprasad
    Participant

    Finally, this is what I did:

    using System;
    using System.IO;
    using System.Linq;
    using System.Net;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;
    using A = DocumentFormat.OpenXml.Drawing;

    class Program {
        static void Main() {
            string fileName = @"text1.docx";
            // Open for editing: each image paragraph is emptied and a placeholder is inserted.
            using (var document = WordprocessingDocument.Open(fileName, true)) {
                var docPart = document.MainDocumentPart;
                var doc = docPart.Document;

                Table myTable = doc.Body.Descendants<Table>().First();

                foreach (TableRow row in myTable.Elements<TableRow>()) {
                    foreach (TableCell cell in row.Elements<TableCell>()) {
                        // Snapshot with ToList() because paragraphs are inserted while looping.
                        foreach (Paragraph p in cell.Descendants<Paragraph>().ToList()) {
                            // a:blip carries the relationship ID of the picture part in r:embed.
                            string rid = p.Descendants<A.Blip>()
                                          .Select(b => b.Embed?.Value)
                                          .FirstOrDefault(id => id != null);
                            if (rid == null) {
                                Console.Write("--n--");
                                continue;
                            }
                            Console.WriteLine("This paragraph contains an image");

                            // Copy the image part to disk, named after its relationship ID.
                            var imagePart = (ImagePart)docPart.GetPartById(rid);
                            string outputFilename = rid;
                            using (var stream = imagePart.GetStream())
                            using (var fileStream = new FileStream(outputFilename, FileMode.Create)) {
                                stream.CopyTo(fileStream);
                            }

                            // Upload the file; the servlet's response is used as the image reference.
                            string result = "";
                            try {
                                using (var client = new WebClient()) {
                                    client.Credentials = CredentialCache.DefaultCredentials;
                                    byte[] rep = client.UploadFile(@"http://localhost:8080/UploadFileServlet/upload", "POST", outputFilename);
                                    result = System.Text.Encoding.UTF8.GetString(rep);
                                    Console.WriteLine("------------>" + result);
                                }
                            } catch (Exception err) {
                                Console.WriteLine(err.Message);
                            }

                            // Empty the original paragraph and put an <img> placeholder before it.
                            Console.WriteLine(p.InnerXml);
                            p.RemoveAllChildren();
                            Paragraph para = p.InsertBeforeSelf(new Paragraph());
                            Run run = para.AppendChild(new Run());
                            run.AppendChild(new Text("<img src='" + result + "' alt='" + result + "' />"));
                        }
                    }
                    Console.WriteLine();
                }
                Console.WriteLine();

                doc.Save();
            }
        }
    }

    Thank you so much, and sorry for the late reply.
