Create PDF media with automatically completed content and metadata

  • Resolved neodid

    (@neodid)


    Hello,

    When a PDF media item is created (by upload or with the “Add from server” plugin), I need to fill its content field with a specific value (text extracted from the PDF) and store metadata extracted from the PDF file (such as the author) in a custom metadata field.

    Which function can I add a filter to?
    I tried the code below to fill in the post content of the PDF media, without success,
    and I don't know how to handle the author metadata.

    add_filter( 'wp_generate_attachment_metadata', function( $metadata, $file )
    {
    include ( 'PdfToText/PdfToText.phpclass' ) ;
    $pdftext = new PdfToText ( get_attached_file( $file ) ) ;
    // $pdftext has a good value
    $metadata = array( 'post_content' => $pdftext );
    return $metadata;
    }, 10, 2 );

    Thanks a lot for your help, and sorry for my bad English.

Viewing 8 replies - 1 through 8 (of 8 total)
  • Moderator bcworkz

    (@bcworkz)

    Your $file parameter is the attachment post ID, not the file path. The file path should be in $metadata['file']. You should not assign a new array to the return value. Any data added should be added or merged with the original data array. Assigning a new array destroys any other established metadata, some of which is required by WP to function correctly.
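
A minimal sketch of the "add or merge" advice above. The helper `myprefix_extract_pdf_info()` is a hypothetical stand-in for whatever PDF library is in use, and the `myprefix_` prefix and the `pdf_text` / `pdf_author` keys are assumptions, not WordPress conventions. Since the second filter argument is the attachment ID, `get_attached_file()` can resolve the file path from it.

```php
<?php
/**
 * Hypothetical stand-in for a real PDF parser.
 * Replace the body with calls to your PDF library.
 */
function myprefix_extract_pdf_info( string $path ): array {
    return array(
        'pdf_text'   => '',
        'pdf_author' => '',
    );
}

/**
 * Merge extra keys into the existing metadata instead of replacing it.
 */
function myprefix_merge_pdf_metadata( $metadata, array $extra ): array {
    // Defensive: an earlier filter could have returned a non-array.
    if ( ! is_array( $metadata ) ) {
        $metadata = array();
    }
    return array_merge( $metadata, $extra );
}

/**
 * Filter callback: the second argument is the attachment post ID.
 */
function myprefix_on_generate_metadata( $metadata, $attachment_id ) {
    if ( 'application/pdf' !== get_post_mime_type( $attachment_id ) ) {
        return $metadata; // Leave images and other media untouched.
    }
    $path = get_attached_file( $attachment_id );
    if ( ! $path ) {
        return $metadata;
    }
    return myprefix_merge_pdf_metadata( $metadata, myprefix_extract_pdf_info( $path ) );
}

// Register only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'wp_generate_attachment_metadata', 'myprefix_on_generate_metadata', 10, 2 );
}
```

Keys merged this way are typically saved with the rest of the attachment metadata and can be read back with `wp_get_attachment_metadata()`. They do not change the attachment's `post_content`; that would need a separate `wp_update_post()` call.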

    Which post’s content are you trying to create? Attachment posts normally do not make use of post_content. And to which meta data is the author info supposed to go, and which author do you mean? The attachment post? The attachment post’s parent? The current user as author? The PDF author?

    Thread Starter neodid

    (@neodid)

    When the media item is created, I want to automatically fill in its content with the text extracted from the PDF file, and I want to create a custom metadata field on the media item holding the author attribute from the PDF file.

    • This reply was modified 4 years, 7 months ago by neodid.
    Moderator bcworkz

    (@bcworkz)

    I’m sorry, I don’t understand what you mean when you want to add content and meta data to “the media”. “The media” is the PDF file, which already has content and meta data. You want to add more content and data to the PDF file? From the PDF file? That is illogical. If not the PDF, then where? The attachment post? While this is possible, attachment content is typically ignored. Do you mean the attachment’s parent post? This is possible too, but there is not always a parent post when media is uploaded.

    Thread Starter neodid

    (@neodid)

    It is very nice of you to help me.

    My objective is to do a full-text search of PDF media.
    So I want to upload the PDF file to create a media item in WordPress. This media item has a title, content (description), and other properties.
    I want to fill in the media description with the text extracted from the PDF.
    That way, when I use the WordPress search, it searches this description, like a full-text PDF search.
    I do not want to create a WordPress post, only create the media item and fill in its content (description).

    • This reply was modified 4 years, 7 months ago by neodid.
    Moderator bcworkz

    (@bcworkz)

    Ah, OK, I understand now, thanks for explaining. One approach is to save everything in attachment meta data, the text, author, and anything else. Thus everything would be added or merged with the passed data array. The standard WP search only searches titles and content of posts. You would either need to develop a custom PDF search query or alter the default search to include PDF attachment meta data.

    Another approach would be to create a custom post type and assign PDF text to its post_content. Any other extracted meta data could then be stored in this post type’s meta data instead of the attachment. Then altering the default search query would be much easier — you merely need to add the custom post type to the array of post types searched.
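
A rough sketch of adding a custom post type to the searched types, as described above. The post type name `pdf_document` and the `myprefix_` prefix are hypothetical, and the fallback list of `post` and `page` is an assumption about what the site otherwise searches. A post type registered as public is usually searchable already, so this may only be needed when searches are restricted to specific types.

```php
<?php
/**
 * Return the list of searched post types with $extra appended.
 * $current is whatever the query's 'post_type' variable holds:
 * an empty value, 'any', a single type, or an array of types.
 */
function myprefix_add_search_post_type( $current, string $extra ): array {
    if ( empty( $current ) || 'any' === $current ) {
        // Assumption: the site otherwise searches posts and pages.
        $types = array( 'post', 'page' );
    } else {
        $types = (array) $current;
    }
    if ( ! in_array( $extra, $types, true ) ) {
        $types[] = $extra;
    }
    return $types;
}

/**
 * Adjust only the main front-end search query.
 */
function myprefix_include_pdf_docs_in_search( $query ) {
    if ( is_admin() || ! $query->is_main_query() || ! $query->is_search() ) {
        return;
    }
    $query->set(
        'post_type',
        myprefix_add_search_post_type( $query->get( 'post_type' ), 'pdf_document' )
    );
}

// Register only when WordPress is loaded.
if ( function_exists( 'add_action' ) ) {
    add_action( 'pre_get_posts', 'myprefix_include_pdf_docs_in_search' );
}
```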

    Why not use the attachment to contain the text instead of a custom post type? You could, except then the search process will have to needlessly search a lot of image attachments, simply a waste of resources.

    Thread Starter neodid

    (@neodid)

    Yes, my first idea was to use post_content. To fill it in, I want to use add_filter on an existing function, but I can't find which one…
    I tried wp_generate_attachment_metadata without success.
    This function must be used by the standard media upload, but also by other plugins such as media from server (which imports files in large quantities).
    Do you have an idea of how to implement this function? (Or another idea 😉)

    Moderator bcworkz

    (@bcworkz)

    I would expect ‘wp_generate_attachment_metadata’ to fire for any upload through the media library. To test if your callback is actually called, add an error_log() call as the first thing. Judging an action by expected results can be misleading when there are other causes for the expected result to fail.
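
For example, a callback that does nothing except log could look like this. The function name is hypothetical. Where the message ends up depends on the PHP and WordPress logging configuration; with `WP_DEBUG` and `WP_DEBUG_LOG` enabled it normally goes to `wp-content/debug.log`.

```php
<?php
/**
 * Log that the filter fired, then pass the metadata through unchanged.
 */
function myprefix_debug_generate_metadata( $metadata, $attachment_id ) {
    error_log(
        sprintf( 'wp_generate_attachment_metadata fired for attachment %d', $attachment_id )
    );
    return $metadata;
}

// Register only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'wp_generate_attachment_metadata', 'myprefix_debug_generate_metadata', 10, 2 );
}
```

If nothing is logged for a given upload path, the hook is not firing there and the problem is not in the extraction code.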

    Mass media importers are another story. They often bypass the usual WP functions in the name of efficiency. They also often fail to provide useful action hooks that allow for extensibility, so your only recourse is to directly hack the importer or to post process the uploaded media. A one callback fits all solution is very unlikely.
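
One possible shape for the post-processing route, as a sketch only. It assumes a hypothetical marker meta key `_myprefix_pdf_indexed` to remember which PDFs have been handled, and takes the text extractor as a callable so that any PDF library can be plugged in. It could be run from a scheduled event or a manual admin action after a bulk import.

```php
<?php
/**
 * Query arguments selecting PDF attachments not yet processed.
 * The meta key is a hypothetical marker, not a WordPress built-in.
 */
function myprefix_unprocessed_pdf_query_args( int $limit = 20 ): array {
    return array(
        'post_type'      => 'attachment',
        'post_status'    => 'inherit',
        'post_mime_type' => 'application/pdf',
        'posts_per_page' => $limit,
        'fields'         => 'ids',
        'meta_query'     => array(
            array(
                'key'     => '_myprefix_pdf_indexed',
                'compare' => 'NOT EXISTS',
            ),
        ),
    );
}

/**
 * Process one batch: store the extracted text in post_content and mark
 * the attachment as done. Returns the number of attachments updated.
 * $extract_text receives a file path and returns a string.
 */
function myprefix_index_pending_pdfs( callable $extract_text, int $limit = 20 ): int {
    $count = 0;
    foreach ( get_posts( myprefix_unprocessed_pdf_query_args( $limit ) ) as $attachment_id ) {
        $path = get_attached_file( $attachment_id );
        if ( ! $path ) {
            continue; // No file on record; skipped and retried on the next run.
        }
        wp_update_post(
            array(
                'ID'           => $attachment_id,
                'post_content' => $extract_text( $path ),
            )
        );
        update_post_meta( $attachment_id, '_myprefix_pdf_indexed', 1 );
        $count++;
    }
    return $count;
}
```

Working in small batches keeps each run short when an importer has added many files at once.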

    Thread Starter neodid

    (@neodid)

    Thank you, this is a good solution.
    Thank you very much for your help.
