WordPress.org

help: convert website from static HTML to wordpress pages (8 posts)

  1. oats
    Member
    Posted 6 years ago #

    Hi,
    I have a website that I would like to convert to be powered by WordPress. The site consists of static HTML pages generated by Dreamweaver, complete with table-based layouts and all!

    I would like to somehow automate extracting content from the static HTML pages, and create WordPress pages. Eventually, I would like to re-create the sitemap, with pages and sub-pages, but perhaps I can't automate that.

    So I guess I am looking for a tool which allows me to extract content, and perhaps page title information from static HTML, and then create WordPress pages. There are too many pages to do this by hand with copy and paste.

    Any advice?
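
    One part of the sitemap question can probably be automated if the folder layout of the static site mirrors the intended page tree. The sketch below is only an illustration under that assumption: it treats each folder's index.html as the parent of everything beneath that folder and leaves top-level files without a parent. The function names parent_of and plan_hierarchy are made up for this example.

```python
from pathlib import PurePosixPath


def parent_of(path, known):
    """Return the nearest ancestor index.html found in `known`, or None."""
    p = PurePosixPath(path)
    # An index page stands for its own folder, so start one level higher.
    folder = p.parent.parent if p.name == "index.html" else p.parent
    while folder != folder.parent:  # stop once the site root is reached
        candidate = str(folder / "index.html")
        if candidate in known:
            return candidate
        folder = folder.parent
    return None


def plan_hierarchy(paths):
    """Map every relative file path to its parent page path (or None)."""
    known = set(paths)
    return {path: parent_of(path, known) for path in sorted(known)}
```

    Paths are expected relative to the site root with forward slashes, for example "products/widgets.html". Pages would then need to be created parents first, so that each parent's WordPress ID exists before its sub-pages are added.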

  2. Samuel Wood (Otto)
    Tech Ninja
    Posted 6 years ago #

    > Any advice?

    Learn to program, and then write a tool to extract the content from your pages.

    Only you have your pages. Only you can create a tool to extract content from them. There is no such thing as some magical way to extract content from arbitrary data.

  3. oats
    Member
    Posted 6 years ago #

    Otto,
    thanks for your suggestion; that was my backup plan. I actually know how to program, but I figured this is a common problem that someone has solved before, and perhaps an open source tool is even available.

    I could do web programming with PHP, or perhaps use C++, Java, or some scripting language.

    How have other people solved this?

  4. oats
    Member
    Posted 6 years ago #

    Also,
    extracting the data is one step, but then how do I automate the creation of WordPress pages?
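
    One possible route, not one suggested in this thread, is the WordPress XML-RPC interface. The sketch below assumes the site has XML-RPC enabled at xmlrpc.php and runs a WordPress version that provides the wp.newPost method; the helper names build_page_struct and create_page are invented for this example, and the blog_id default is an assumption for a single-site install.

```python
import xmlrpc.client


def build_page_struct(title, body, parent_id=None, status="draft"):
    """Build the content struct passed to the wp.newPost XML-RPC method."""
    page = {
        "post_type": "page",
        "post_status": status,
        "post_title": title,
        "post_content": body,
    }
    if parent_id is not None:
        # Sub-pages point at the ID of an already created parent page.
        page["post_parent"] = int(parent_id)
    return page


def create_page(endpoint, username, password, page, blog_id=0):
    """Send one page to the site; returns whatever ID the server reports."""
    server = xmlrpc.client.ServerProxy(endpoint)
    return server.wp.newPost(blog_id, username, password, page)
```

    A call would look something like create_page("https://example.com/xmlrpc.php", "editor", "secret", build_page_struct("About", "<p>Hello</p>")), where the URL and credentials are placeholders. Creating pages as drafts first leaves room to review them before publishing.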

  5. moshu
    Member
    Posted 6 years ago #

  6. apzc2529
    Member
    Posted 6 years ago #

  7. venik4
    Member
    Posted 5 years ago #

    Unfortunately, there is no magic bullet for converting static HTML to WordPress. Much depends on the complexity and the layout of your current static site. What you need is a script that will read your original HTML files, extract page title, file timestamp, description or excerpt, and body content.
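
    As a rough illustration of such a script, here is a minimal sketch using only Python's standard library. It assumes each page has a <title>, an optional <meta name="description">, and a <body>, and that files are UTF-8; the names PageExtractor and extract_page are made up for this example. Comments and doctypes are dropped, and inline scripts in the body are not treated specially.

```python
import os
from datetime import datetime, timezone
from html import escape
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect the title text, the meta description, and the body markup."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.body_parts = []
        self._in_title = False
        self._in_body = False

    def _read_meta(self, attrs):
        attrs = dict(attrs)
        if (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "body":
            self._in_body = True
        elif self._in_body:
            # Keep body tags exactly as they were written in the file.
            self.body_parts.append(self.get_starttag_text())
        elif tag == "meta":
            self._read_meta(attrs)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> arrive here instead.
        if self._in_body:
            self.body_parts.append(self.get_starttag_text())
        elif tag == "meta":
            self._read_meta(attrs)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False
        elif self._in_body:
            self.body_parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_body:
            # The parser decodes entities, so encode them again for output.
            self.body_parts.append(escape(data, quote=False))


def extract_page(path):
    """Return title, description, body markup, and file timestamp for one file."""
    parser = PageExtractor()
    with open(path, encoding="utf-8", errors="replace") as handle:
        parser.feed(handle.read())
    parser.close()
    modified = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
    return {
        "title": parser.title.strip(),
        "description": parser.description.strip(),
        "body": "".join(parser.body_parts).strip(),
        "modified": modified,
    }
```

    Running extract_page over every .html file found with os.walk would give one record per page. If all pages share a template, the body could be narrowed further to the table cell or div that holds the real content.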

    There is a list of allowed HTML tags in WordPress posts (http://faq.wordpress.com/2006/06/08/allowed-html-tags/). You will need to parse the body of your pages to get rid of any tags that are not on this list. Then your script will need to connect to MySQL and insert this data into all the appropriate tables. Perl is probably the best tool for parsing stuff like this, but I've done something similar before with shell scripts and good ole' awk and sed.
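
    The tag clean-up step could look roughly like the sketch below, written in Python rather than Perl purely for illustration. The ALLOWED_TAGS and ALLOWED_ATTRS values are placeholders, not the actual WordPress list, and should be replaced with the tags from the page linked above. Disallowed tags are removed but their text is kept, except for script and style, whose contents are dropped entirely.

```python
from html import escape
from html.parser import HTMLParser

# Placeholder whitelist for illustration only; not the real WordPress list.
ALLOWED_TAGS = {"a", "b", "blockquote", "br", "em", "i", "li", "ol", "p", "strong", "ul"}
ALLOWED_ATTRS = {"a": {"href", "title"}}
VOID_TAGS = {"br"}
DROP_WITH_CONTENT = {"script", "style"}


class TagFilter(HTMLParser):
    """Re-emit markup, keeping only whitelisted tags and attributes."""

    def __init__(self):
        super().__init__()
        self.out = []
        self._skipping = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self._skipping += 1
            return
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text still passes through
        allowed = ALLOWED_ATTRS.get(tag, ())
        rendered = "".join(
            f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name in allowed and value is not None
        )
        self.out.append(f"<{tag}{rendered}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags are written back as plain start tags.
        if tag not in DROP_WITH_CONTENT:
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self._skipping = max(0, self._skipping - 1)
        elif tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skipping:
            self.out.append(escape(data, quote=False))


def strip_disallowed(markup):
    """Return `markup` with every non-whitelisted tag and attribute removed."""
    tag_filter = TagFilter()
    tag_filter.feed(markup)
    tag_filter.close()
    return "".join(tag_filter.out)
```

    With a table-based layout this unwraps the table, tr, td, and font tags and leaves the paragraphs and links inside them, which is usually what is wanted before the content goes into WordPress.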

    If you want to get lazy about it, perhaps you can use a ready-made HTML parser, such as html2txt. You will lose certain formatting, so some manual tweaking may be required. If you used the same page template for most of your current site, this should not be very difficult.

  8. mikey1
    Member
    Posted 5 years ago #

    Hi there, interesting post. As you are familiar with programming and HTML, you may want to take a look at Microsoft's free software, Windows Live Writer.
    It's great for creating posts or pages, uses familiar HTML tables for layout, and has a large number of its own plugins to help with layout. It also makes uploading images very easy with its built-in FTP.
    This may sound like too easy an option for you, but as my site is mainly HTML, with WordPress and various other PHP and MySQL systems running, I thought it might give you some ideas. Good luck with the project.
    mike.

This topic has been closed to new replies.
