• Resolved grexican

    (@grexican)


    I’m exporting a lot of data — thousands of users with hundreds of profile fields. It’s not ideal! And when I was doing so, I was running out of memory (limit set to 256M).

    After a lot of debugging, I found that it was the WP non-persistent cache of BP objects that was filling up the memory. To fix it, I added the following (I added reference lines for context):

    Around line 500, I added $memory_limit and I make sure that the output isn’t buffered (so it keeps spitting data into the file and downloads as fast as possible). Since the cache is what’s growing, I take a look to see if my PHP install is capable of overriding the core cache functions; if so, I blank them out so nothing new gets cached, just while the page is running:

    if(ob_get_level() > 0) ob_end_flush(); // no more buffering while spitting back the export data (the check avoids a notice when no buffer is active)
    
                $memory_limit = $this->return_bytes(ini_get('memory_limit')) * .75;
    
    			// we need to disable caching while exporting because we export so much data that it could blow the memory cache
    			// if we can't override the cache here, we'll have to clear it later...
    			if(function_exists('override_function'))
    			{
    				override_function('wp_cache_add', '$key, $data, $group="", $expire=0', '');
    				override_function('wp_cache_set', '$key, $data, $group="", $expire=0', '');
    				override_function('wp_cache_replace', '$key, $data, $group="", $expire=0', '');
    				override_function('wp_cache_add_non_persistent_groups', '$groups', ''); // this one takes a single $groups argument
    			}
    			elseif(function_exists('runkit_function_redefine'))
    			{
    				runkit_function_redefine('wp_cache_add', '$key, $data, $group="", $expire=0', '');
    				runkit_function_redefine('wp_cache_set', '$key, $data, $group="", $expire=0', '');
    				runkit_function_redefine('wp_cache_replace', '$key, $data, $group="", $expire=0', '');
    				runkit_function_redefine('wp_cache_add_non_persistent_groups', '$groups', ''); // this one takes a single $groups argument
    			}
    
                //open doc wrapper..
                echo $doc_begin;
    
                // echo headers ##
                echo $pre . implode( $seperator, $headers ) . $breaker;

    Somewhere in the class, I added the function return_bytes:

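    function return_bytes($val)
    		{
    			$val = trim($val);
    			$last = strtolower($val[strlen($val)-1]);
    			$val = (int) $val; // drop the unit suffix so the arithmetic below works on a number
    			switch($last) {
    				// The 'G' modifier is available since PHP 5.1.0
    				// each case intentionally falls through to the next
    				case 'g':
    					$val *= 1024;
    				case 'm':
    					$val *= 1024;
    				case 'k':
    					$val *= 1024;
    			}
    
    			return $val;
    		}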

    Around line 570, where it echoes each user row, I added:

    // echo row data ##
                    echo $pre . implode( $seperator, $data ) . $breaker;
    
    				if(memory_get_usage() > $memory_limit)
    				{
    					wp_cache_flush();
    				}
                }

    I’m not sure if the buffer line is necessary. But I can tell you that without the wp_cache_flush line I can’t export more than a few hundred of my users at a time before it runs out of memory. I wish I could clear just the BP stuff, but I didn’t see a way to do that. I weighed my options and decided that clearing my cache but being able to get data was more important than the time spent having to recreate whatever cache I just blew away. If you can come up with a better solution that isolates things, I’d love to hear about it!
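
    The flush-on-threshold idea above can be tried outside WordPress with a minimal sketch. Everything named here is hypothetical: a plain array `$cache` stands in for the object cache, emptying it plays the role of wp_cache_flush(), and `$budget` is an arbitrary 2 MB of headroom rather than 75% of memory_limit, so the loop behaves the same under any PHP configuration.

```php
<?php
// Hypothetical stand-in for the object cache: a plain array that grows per row.
$cache   = [];
$flushes = 0;

// Arbitrary budget: current usage plus 2 MB of headroom (an assumption for this sketch).
$budget = memory_get_usage() + 2 * 1024 * 1024;

for ( $i = 0; $i < 200; $i++ ) {
    // Each "row" leaves roughly 64 KB behind in the cache.
    $cache[ $i ] = str_repeat( 'x', 64 * 1024 );

    // Same check as in the export loop: flush once usage crosses the budget.
    if ( memory_get_usage() > $budget ) {
        $cache = []; // plays the role of wp_cache_flush()
        $flushes++;
    }
}

echo "flushed {$flushes} times, " . count( $cache ) . " entries left\n";
```

    The cache never holds all 200 rows at once, which is the point: memory stays under the budget at the cost of rebuilding whatever was thrown away.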

    I hope this helps.
    -eli

    https://wordpress.org/plugins/export-user-data/

Viewing 14 replies - 1 through 14 (of 14 total)
  • Plugin Author qstudio

    (@qlstudio)

    Hi Eli,

    This looks like a pretty complete solution to the memory issue.

    The plugin already features an export limiter, which lets you break the export into smaller batches and control which rows are exported using an offset option.

    If there is a wider need for this solution, I’ll review the code locally and consider inclusion in a future release.

    Cheers

    Q

    Thanks for posting this, I was running into this problem and this helps. I _think_ I never saw it before BuddyPress 2.0 came out. The release notes say they changed some things about how caching works with that release. Anyway, nice fix.

    Plugin Author qstudio

    (@qlstudio)

    @grexican,

    In your switch case in the return_bytes method, no matter the entered value the output is multiplied by 1024 ( $val *= 1024 ) – am I missing something?

    Also, the first ob_end_flush() – please explain why that is required?

    Cheers

    Q

    Thread Starter grexican

    (@grexican)

    It’s a fall-through. So in the case of ‘g’ (gigabytes) the value (let’s say 1g) gets multiplied by 1024 * 1024 * 1024. So 1 * 1024 * 1024 * 1024 = # of bytes in a gig. 1 * 1024 = # of megabytes in a gig; again * 1024 = # of kb in a gig; again * 1024 = # of bytes. We need it in bytes because that’s what memory_get_usage() returns.
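
    A minimal standalone sketch of that fall-through, for anyone who wants to check the arithmetic. `size_to_bytes` is a hypothetical name, not the plugin's method, and the integer cast before multiplying is an assumption added so the arithmetic runs on a number rather than on a string like "256M".

```php
<?php
/**
 * Hypothetical helper: convert a php.ini shorthand size such as "256M"
 * into bytes, using switch fall-through for the unit suffix.
 */
function size_to_bytes( string $val ): int {
    $val  = trim( $val );
    $unit = strtolower( substr( $val, -1 ) ); // last character: g, m, k or a digit
    $num  = (int) $val;                       // "256M" becomes 256

    switch ( $unit ) {
        case 'g':
            $num *= 1024; // no break: continues into 'm'
        case 'm':
            $num *= 1024; // no break: continues into 'k'
        case 'k':
            $num *= 1024;
    }

    return $num;
}

echo size_to_bytes( '1G' ), "\n";   // 1073741824
echo size_to_bytes( '256M' ), "\n"; // 268435456
echo size_to_bytes( '512K' ), "\n"; // 524288
echo size_to_bytes( '4096' ), "\n"; // 4096
```

    One case worth keeping in mind: a memory_limit of -1 means "no limit", and a conversion like this returns -1 for it, so a threshold computed from that value would be negative and the usage check would be true on every row.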

    ob_end_flush() is for good measure. On low-memory systems, you can fill up the output buffer when you echo things. ob_end_flush() flushes the active output buffer and turns it off, so each echo after it is written to the output stream immediately instead of being buffered. This takes the weight off the memory limits and also starts dumping data to be downloaded as soon as it’s ready. It’s also slightly slower than buffered output because of the context switching between the PHP call to write and the actual writing, but the tradeoffs are worth it for file downloading, imho.
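
    The buffering behaviour can be seen in a short standalone sketch. The `$base` and `$held` variables are hypothetical names for illustration; the sketch opens its own buffer with ob_start() to stand in for one opened earlier in the request, and only closes what it opened.

```php
<?php
// Remember how many buffers are already active so only ours gets closed here.
$base = ob_get_level();

ob_start();              // stand-in for a buffer opened earlier in the request
echo "row 1\n";          // held in memory rather than sent
$held = ob_get_length(); // bytes currently waiting in the buffer (6)

// Flush and close the buffer. ob_end_flush() raises a notice when no buffer
// is active, so check the level first.
while ( ob_get_level() > $base ) {
    ob_end_flush();
}

echo "row 2\n";          // no longer held by the buffer opened above
```

    In the export code the comparison would presumably be against 0 rather than `$base`, assuming nothing else on the page relies on the outer buffers. The web server may still do its own buffering after PHP hands the data over.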

    Plugin Author qstudio

    (@qlstudio)

    @grexican – all makes good sense – thanks, I’m reading over all the new options, tidying up and testing things – I might push an update onto git later today – if you’d all be up for testing from there?

    Cheers

    Q

    Thread Starter grexican

    (@grexican)

    Sure, I’ll definitely test it for you, but I unfortunately won’t be able to until next week. I’m about to head out for a few days for a business conference and won’t get back until Saturday.

    Plugin Author qstudio

    (@qlstudio)

    @grexican – No problem, better if I take a bit of time and do more testing.. I’ll push a few changes to Git and you can grab it when you like ( https://github.com/qstudio/export-user-data ) – it’s well out of date there now..

    Thread Starter grexican

    (@grexican)

    Sounds like a plan. I’ve starred the project. Let me know when you’ve updated the repo and I’ll take it for a test run with my larger-than-life export.

    Plugin Author qstudio

    (@qlstudio)

    Ok – first push is up:

    https://github.com/qstudio/export-user-data

    There are a few bits missing and a few changes, mostly to the saving of previous export data:

    – I moved this from the wp_options table to the wp_usermeta – this will allow several users to store data on the same install
    – I moved some of the saving and retrieving options around in the layout and reformatted the methods a little, to match the rest of the plugin
    – I had to comment out the array data sanitizing lines for now, noted with an @todo. @cwjordan, please can you take a look at this, as it’s including the saved array as a key within itself?

    I tested a 1500 member export and all the saving and loading functions, but more testing is required before this goes up on WP.

    Any questions, just ask.

    Q

    Thread Starter grexican

    (@grexican)

    Hi Q,

    I’m back from my trip. I’ll try out your plugin changes sometime this week.

    Also, since you’re adding save changes, I think adding a “select all” feature would be very much appreciated.

    In my case, when I wrote this little JS snippet, it didn’t save changes and I didn’t need the fancy selector. So I just used JS to hide it and add a “select all” feature with jQuery. Adding this feature natively would be a big help I think. We have hundreds of fields.

    Here’s my jQuery. You should be able to easily adapt it with a static element and then call an “update” on the field picker UI to refresh based on the hidden field’s data:

    setTimeout(function()
    		{
    			$('#bp_fields').css('position', 'static !important').css('left', '');
    			$('#ms-bp_fields').hide();
    
    			var $bpAll = $('<span> <a href="#" onclick="return false;">Select All</a></span>').on('click', function()
    			{
    				$('#bp_fields option').prop('selected', true);
    			});
    
    			$('#bp_fields').after($bpAll);
    
    		}, 50);

    Plugin Author qstudio

    (@qlstudio)

    @grexican – I’ve added a simple “All” or “None” select option for each list – this does not play nicely with the filter yet ( be nice if you could filter, then select all – but perhaps this will never get used anyhow.. ).

    I’ve pushed this change back up to github

    Q

    Plugin Author qstudio

    (@qlstudio)

    @grexican – I’ve got a question, how could I contact you?

    See here for reference: http://wordpress.stackexchange.com/questions/167111/wp-user-query-and-non-unique-usermeta-data

    Cheers

    Q

    Thread Starter grexican

    (@grexican)

    eli dot gassert
    at toad-software dot com

    Plugin Author qstudio

    (@qlstudio)

    Thanks!

  • The topic ‘HOW TO: Exporting A LOT of Data (Out of Memory Issue)’ is closed to new replies.