• Hi,

    I’m trying to update a media display widget plugin I made some time ago to add TikTok media support.

    The widget scans recent posts for particular types of media using get_media_embedded_in_content() and displays detected media in the widget.

    This works well for YouTube, Video and SoundCloud media (inserted via the relevant builtin WP blocks), but fails to detect similar TikTok content.

    Is this a possible bug in get_media_embedded_in_content(), or is it intended behaviour that TikTok media is not discovered in post content in the same way as other media?

    Development Environment:

    MAMP
    PHP 7.4.21
    Apache
    WordPress 6.1

Viewing 8 replies - 1 through 8 (of 8 total)
  • Moderator bcworkz

    (@bcworkz)

    The function simply locates various HTML embed tags like iframe, embed, object, etc. Other tags can be added to the search criteria through the “media_embedded_in_content_allowed_types” filter.
    https://developer.wordpress.org/reference/hooks/media_embedded_in_content_allowed_types/

    Is TikTok media not embedded by one of the tags listed in the linked doc page? If not, use the filter to add the applicable tag name. If it is one of the tags listed, please provide sample content where the media fails to be found so someone can investigate further.
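
    For illustration, a minimal sketch of hooking that filter might look like the following. The function name myplugin_allow_extra_embed_tag() is made up, and the tag name 'blockquote' is only a placeholder for whichever tag turns out to wrap the embed. The add_filter() call is guarded so the snippet also loads outside WordPress.

```php
<?php
/**
 * Hypothetical callback: adds one extra tag name to the list of tags
 * that get_media_embedded_in_content() searches for.
 *
 * @param string[] $allowed_media_types Tag names currently searched for.
 * @return string[] The list with the extra tag appended once.
 */
function myplugin_allow_extra_embed_tag( array $allowed_media_types ): array {
	// 'blockquote' is only an example; substitute whichever tag wraps the embed.
	$extra_tag = 'blockquote';

	if ( ! in_array( $extra_tag, $allowed_media_types, true ) ) {
		$allowed_media_types[] = $extra_tag;
	}

	return $allowed_media_types;
}

// Register the callback only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'media_embedded_in_content_allowed_types', 'myplugin_allow_extra_embed_tag' );
}
```

    Two caveats, as far as I can tell from the core source: a tag added this way matches every element of that type in the content, not only embeds, and if you pass a $types argument to get_media_embedded_in_content() the new tag has to be listed there as well.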

    Thread Starter toneburst

    (@toneburst)

    Hi,

    thanks very much for getting back to me.

    “The function simply locates various HTML embed tags like iframe, embed, object, etc.”

    I assumed this was the case.

    I presume the function parses post markup after oEmbed has done its thing and generated the actual elements used to embed the content, rather than parsing the block markup that’s actually in the database.

    For what it’s worth, here’s the block markup from the Post Editor, with TikTok and YouTube media:

    <!-- wp:embed {"url":"https://www.tiktok.com/@cristopherisrael_actor/video/7146202207426219270","type":"video","providerNameSlug":"tiktok","responsive":true} -->
    <figure class="wp-block-embed is-type-video is-provider-tiktok wp-block-embed-tiktok"><div class="wp-block-embed__wrapper">
    https://www.tiktok.com/@cristopherisrael_actor/video/7146202207426219270
    </div></figure>
    <!-- /wp:embed -->
    
    <!-- wp:embed {"url":"https://www.youtube.com/watch?v=vjD3EVC1-zU","type":"video","providerNameSlug":"youtube","responsive":true,"className":"wp-embed-aspect-4-3 wp-has-aspect-ratio"} -->
    <figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-4-3 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
    
    </div></figure>
    <!-- /wp:embed -->

    Here’s the markup generated for the TikTok media by oEmbed, when the post is published on the site’s frontend:

    <figure class="wp-block-embed is-type-video is-provider-tiktok wp-block-embed-tiktok">
    	<div class="wp-block-embed__wrapper">
    		<blockquote class="tiktok-embed" cite="https://www.tiktok.com/@cristopherisrael_actor/video/7146202207426219270" data-video-id="7146202207426219270" data-embed-from="oembed" style="max-width: 605px;min-width: 325px;" id="v25382748295997190">
    			<iframe style="width: 100%; height: 704px; display: block; visibility: unset; max-height: 704px;" name="__tt_embed__v25382748295997190"
    				sandbox="allow-popups allow-popups-to-escape-sandbox allow-scripts allow-top-navigation allow-same-origin"
    				src="https://www.tiktok.com/embed/v2/7146202207426219270?lang=en-GB&referrer=http%3A%2F%2Flocalhost%2Fwp%2F2022%2F11%2F21%2Ftiktok-content-test%2F&embedFrom=oembed"></iframe>
    		</blockquote>
    		<script async="" src="https://www.tiktok.com/embed.js"></script>
    	</div>
    </figure>

    Here’s the markup for a YouTube video:

    <figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-4-3 wp-has-aspect-ratio">
    	<div class="wp-block-embed__wrapper">
    		<div class="flex-video flex-video-youtube">
    			<iframe loading="lazy" title="Yazz - The Only Way Is Up" src="https://www.youtube.com/embed/vjD3EVC1-zU?feature=oembed"
    				allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" width="1060" height="795" frameborder="0"></iframe>
    		</div>
    	</div>
    </figure>

    I have noticed one difference that might be significant: it seems that the actual TikTok iFrame element is wrapped inside a blockquote element, whereas it seems to live inside a div, in the case of YouTube (and also Vimeo and SoundCloud) media.

    TikTok Markup:

    <figure>
    	<div>
    		<blockquote>
    			<!-- ACTUAL MEDIA iFRAME -->
    			<iframe></iframe>
    		</blockquote>
    		<!-- ADDITIONAL JAVASCRIPT -->
    		<script async="" src="https://www.tiktok.com/embed.js"></script>
    	</div>
    </figure>

    There’s also a linked JavaScript file. Don’t know if this is significant.

    YouTube Markup:

    <figure>
    	<div>
    		<div>
    			<!-- ACTUAL MEDIA iFRAME -->
    			<iframe></iframe>
    		</div>
    	</div>
    </figure>

    Could this slightly different element tree be confusing the tag pattern-matching somehow?

    • This reply was modified 1 year, 7 months ago by toneburst.
    Moderator bcworkz

    (@bcworkz)

    Thanks for the detailed examples. I thought this was something I’d be able to investigate in greater detail, but the regexp syntax used by get_media_embedded_in_content() is beyond my comprehension. AFAICT it simply plucks iframe and other embeds out of content. It should find any iframe no matter what its container or inner content. Presumably you’re using this after oEmbed or else nothing would work. I’m starting to suspect some sort of race condition where TikTok’s oEmbed process is too slow and your function call executes too soon to allow oEmbed to complete.

    Maybe your code could find the original block content prior to oEmbed and run it through oEmbed for itself?

    Thread Starter toneburst

    (@toneburst)

    “Thanks for the detailed examples. I thought this was something I’d be able to investigate in greater detail, but the regexp syntax used by get_media_embedded_in_content() is beyond my comprehension.”

    Mine, too, I’m afraid.

    “AFAICT it simply plucks iframe and other embeds out of content. It should find any iframe no matter what its container or inner content.”

    I’d have thought so, too, so your suggestion that the TikTok media iFrame tag isn’t present in the content being parsed at the point that happens makes a lot of sense.

    As I said, oEmbed-generated tags for other media types are successfully extracted from the content.

    “Maybe your code could find the original block content prior to oEmbed and run it through oEmbed for itself?”

    I was wondering about this, too. I’m unsure how to approach such a task. I guess it would entail rolling my own alternative to get_media_embedded_in_content(). I imagine it could be very slow, since it would also need to parse the entire content of multiple posts.

    • This reply was modified 1 year, 7 months ago by toneburst.
    Moderator bcworkz

    (@bcworkz)

    “Slow” is relative. Yes, applying preg_match_all() to all of content is “slow”, but its impact on overall page speed doesn’t seem to be very significant. After all, it’s what get_media_embedded_in_content() does to apparently little ill effect.

    Completely replacing get_media_embedded_in_content() with your own version would be one solution. The idea is a little daunting in my mind. I’d be inclined to do more of a band-aid solution: locate just the TikTok embed URLs in source content prior to oEmbed and leave the current scheme in place to find the rest. Run the TikTok URLs through wp_oembed_get() and merge the resulting HTML with the rest of the embed results.

    My hunch is using preg_match_all() to find only TikTok URLs wouldn’t be all that time consuming in relation to everything else that happens.
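
    A rough sketch of that band-aid, under some assumptions: the URL pattern is inferred from the single example URL posted above, both function names are made up, and the wp_oembed_get() call is guarded so the snippet loads outside WordPress too (where it simply returns nothing).

```php
<?php
/**
 * Collects TikTok video URLs from raw (pre-oEmbed) post content.
 *
 * @param string $raw_content Post content as stored in the database.
 * @return string[] Unique video URLs in order of first appearance.
 */
function myplugin_find_tiktok_urls( string $raw_content ): array {
	// Assumed URL shape: https://www.tiktok.com/@user/video/1234567890
	$pattern = '~https?://(?:www\.)?tiktok\.com/@[\w.-]+/video/\d+~i';

	if ( ! preg_match_all( $pattern, $raw_content, $matches ) ) {
		return array();
	}

	// The block comment and the block body both contain the URL, so de-duplicate.
	return array_values( array_unique( $matches[0] ) );
}

/**
 * Resolves each URL to embed HTML via wp_oembed_get() when WordPress is loaded.
 *
 * @param string[] $urls TikTok video URLs.
 * @return string[] Embed HTML strings; empty outside WordPress.
 */
function myplugin_tiktok_embeds_from_urls( array $urls ): array {
	$embeds = array();

	if ( ! function_exists( 'wp_oembed_get' ) ) {
		return $embeds;
	}

	foreach ( $urls as $url ) {
		$html = wp_oembed_get( $url ); // Returns false on failure.
		if ( $html ) {
			$embeds[] = $html;
		}
	}

	return $embeds;
}
```

    The returned HTML could then be merged with whatever get_media_embedded_in_content() finds for the other providers, e.g. with array_merge().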

    One variation would be to have a function that extracts only TikTok embed HTML after oEmbed as we’d have wanted get_media_embedded_in_content() to do in the first place. This may be easier to implement and a better fit for your current situation.

    Thread Starter toneburst

    (@toneburst)

    Good advice, thank you again!

    “I’d have thought so, too, so your suggestion that the TikTok media iFrame tag isn’t present in the content being parsed at the point that happens makes a lot of sense.”

    Incidentally, I did a bit more debugging, and was able to confirm the above.

    Dumping the entire content of the first post found, supposedly post-oEmbed, revealed the expected markup for other media types, but only the block markup, straight from the database for the TikTok block, so at the time my plugin’s loop executed, oEmbed for TikTok content indeed hadn’t run.

    This definitely looks like a bug to me.

    • This reply was modified 1 year, 7 months ago by toneburst.
    Thread Starter toneburst

    (@toneburst)

    Ah.. I looked again, and I was wrong!

    This is actually the markup produced for the TikTok block:

    <figure class="wp-block-embed is-type-video is-provider-tiktok wp-block-embed-tiktok">
    	<div class="wp-block-embed__wrapper">
    		<blockquote class="tiktok-embed" cite="https://www.tiktok.com/@cristopherisrael_actor/video/7146202207426219270" data-video-id="7146202207426219270" data-embed-from="oembed" style="max-width: 605px;min-width: 325px;">
    			<section>
    				<a target="_blank" title="@cristopherisrael_actor" href="https://www.tiktok.com/@cristopherisrael_actor?refer=embed">@cristopherisrael_actor</a>
    				<p></p>
    				<a target="_blank" title="♬ sonido original - cristopherisraels" href="https://www.tiktok.com/music/sonido-original-7146202307351743238?refer=embed">♬ sonido original – cristopherisraels</a>
    			</section>
    		</blockquote>
    		<script async src="https://www.tiktok.com/embed.js"></script>
    	</div>
    </figure>

    So oEmbed has in fact done its thing! The key point though is that there’s no iFrame!

    I guess the linked JS file is meant to insert the iFrame element itself at page runtime.

    This does explain why get_media_embedded_in_content() is not able to find TikTok media.

    I guess what I need to do now is to work out how to roll a function to extract TikTok media markup, and run that function either if no other media is found, or before I run get_media_embedded_in_content().

    At the moment, my plugin only displays the first media of the specified type(s) found. If I chose to allow multiple media items to be shown, I’d potentially hit an issue where they might appear out-of-sequence. I think I can probably live with that though.

    • This reply was modified 1 year, 7 months ago by toneburst.
    Moderator bcworkz

    (@bcworkz)

    You’ll mainly need the right regexp to use in preg_match() (or preg_match_all()). Matching on the TikTok domain name while capturing the surrounding HTML that your plugin needs ought to do the trick. The blockquote and script tags I imagine. If you need help composing the right regexp, I suggest regexr.com or similar “fiddle” sites for regexp. You can put example content into the tool’s text area and see how the regexp you are trying matches or doesn’t match portions of the content.
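
    As a starting point, something along these lines might work. It is only a sketch: the pattern is based on the one sample of TikTok markup posted above (a blockquote with class "tiktok-embed", optionally followed by the embed.js script tag), the function name is made up, and TikTok could change its markup at any time.

```php
<?php
/**
 * Extracts TikTok embed markup (blockquote plus optional embed.js script tag)
 * from rendered post content.
 *
 * @param string $content Post content after oEmbed has run.
 * @return string[] Matched HTML fragments in document order.
 */
function myplugin_get_tiktok_embeds( string $content ): array {
	// Part 1: a blockquote whose class attribute contains "tiktok-embed".
	// Part 2 (optional): the embed.js script tag that directly follows it.
	$pattern = '~<blockquote\b[^>]*\bclass="[^"]*\btiktok-embed\b[^"]*"[^>]*>.*?</blockquote>'
		. '(?:\s*<script\b[^>]*\bsrc="https://www\.tiktok\.com/embed\.js"[^>]*>\s*</script>)?~is';

	if ( ! preg_match_all( $pattern, $content, $matches ) ) {
		return array();
	}

	return $matches[0];
}
```

    Ordinary blockquotes without the "tiktok-embed" class are skipped, so regular quoted text in a post should not be picked up.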

    Yes, embeds from the other function would each be in their own order. To avoid that I think you’ll need your own version of get_media_embedded_in_content() that does it all in one crazily complex regexp. It’s tempting to live with the out of order situation 🙂
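
    If the ordering ever does matter, one possible middle ground (untested against real posts) is to run a few simple patterns separately with PREG_OFFSET_CAPTURE and sort the matches by their position in the content. Note this takes the place of get_media_embedded_in_content() for the patterns involved, since that function returns only the HTML strings without positions. The function name and both example patterns are assumptions, simplified for illustration.

```php
<?php
/**
 * Finds HTML fragments for several patterns and returns them in document order.
 *
 * @param string   $content  Rendered post content.
 * @param string[] $patterns PCRE patterns, one per embed type.
 * @return string[] Matched fragments sorted by their position in $content.
 */
function myplugin_embeds_in_document_order( string $content, array $patterns ): array {
	$found = array();

	foreach ( $patterns as $pattern ) {
		if ( preg_match_all( $pattern, $content, $matches, PREG_OFFSET_CAPTURE ) ) {
			foreach ( $matches[0] as $match ) {
				// $match is array( matched text, byte offset ); key by offset.
				$found[ $match[1] ] = $match[0];
			}
		}
	}

	// Sort by offset so fragments come back in the order they appear.
	ksort( $found, SORT_NUMERIC );

	return array_values( $found );
}

// Example patterns (assumptions, simplified for illustration).
$myplugin_patterns = array(
	'~<iframe\b[^>]*>.*?</iframe>~is',
	'~<blockquote\b[^>]*\bclass="[^"]*\btiktok-embed\b[^"]*"[^>]*>.*?</blockquote>~is',
);
```

    Taking only the first element of the result would then give the first media item of any listed type, whichever provider it came from.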

  • The topic ‘get_media_embedded_in_content() Not Finding TikTok Media’ is closed to new replies.