Using PHP and cURL to automate Drupal tasks such as adding nodes or users

Posted by andu
Fri, 2006-09-29 23:06

A while ago I posted how I was using PHP and cURL to do automatic image uploading to Drupal and geolocating images. Some people requested the script, others arrive here from Google after searching for the same thing, so here is a small tutorial on how to do it.

First of all, you'll need PHP's libcurl extension installed. The code snippets I'll present here might also work if you convert them to Snoopy, but libcurl is faster.

What we'll do: log in to Drupal and then perform various operations.

1. Logging in
The first step is to log in. We're going to send the credentials using POST to the /user/login page of our Drupal website. The trick here is to capture the cookie sent by the server to use it later.

Let's init curl and set some options: the site we're going to use along with the login page and a file on our server which will store the cookie. Be careful, the user that the webserver runs as needs to have write access there. CURLOPT_POST tells curl we're going to use POST to send form data.

  $crl = curl_init();
  $url = "http://www.example.com/user/login";
  curl_setopt($crl, CURLOPT_URL, $url);
  curl_setopt($crl, CURLOPT_COOKIEFILE, "/tmp/cookie.txt");
  curl_setopt($crl, CURLOPT_COOKIEJAR, "/tmp/cookie.txt");
  curl_setopt($crl, CURLOPT_FOLLOWLOCATION, 1);
  curl_setopt($crl, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($crl, CURLOPT_POST, 1);
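Since the cookie file has to be writable, it may be worth checking that before sending anything. Here is a minimal sketch of such a check. The function name cookie_file_usable is made up for this example, and it uses the system temp directory instead of a hard-coded /tmp path, which is an assumption about your setup.

```php
<?php
// Hypothetical helper (not part of the original script): can PHP create or
// update the cookie file at $path?
function cookie_file_usable($path)
{
    if (file_exists($path)) {
        // The file is already there, so it must be writable itself
        return is_writable($path);
    }
    // The file does not exist yet, so cURL must be able to create it
    return is_writable(dirname($path));
}

// Assumption: the cookie lives in the system temp directory
$cookieFile = sys_get_temp_dir() . "/cookie.txt";
if (!cookie_file_usable($cookieFile)) {
    echo "Warning: cannot write to $cookieFile\n";
}
```

If the check fails, cURL will usually still run, but the session cookie is lost and every later request is treated as anonymous.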

Next, we need to set the username and password of the user we're logging in as and send the data to the login script. To do this, we inspect the standard Drupal login form and note the field names: they're edit[name], edit[pass], edit[form_id] and op.

Warning: Be careful to set every element of the form, because missing one will make the script fail. In particular, edit[form_id] is essential.

  // this array will hold the field names and values
  $postdata = array(
      "edit[name]" => "admin",
      "edit[pass]" => "password",
      "edit[form_id]" => "user_login",
      "op" => "Log in"
  );
  // tell curl we're going to send $postdata as the POST data
  curl_setopt($crl, CURLOPT_POSTFIELDS, $postdata);
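Because a forgotten field is so easy to miss, a small guard can catch it before the request goes out. This is only a sketch: missing_form_fields is a hypothetical helper, and the list of required names is simply the four login fields mentioned above.

```php
<?php
// Hypothetical helper (not part of the original script): return the required
// field names that are absent from a POST array.
function missing_form_fields(array $postdata, array $required)
{
    // array_diff keeps the required names that are not keys of $postdata
    return array_values(array_diff($required, array_keys($postdata)));
}

// A login array with edit[form_id] left out on purpose
$postdata = array(
    "edit[name]" => "admin",
    "edit[pass]" => "password",
    "op"         => "Log in",
);
$required = array("edit[name]", "edit[pass]", "edit[form_id]", "op");

$missing = missing_form_fields($postdata, $required);
if ($missing) {
    echo "Missing form fields: " . implode(", ", $missing) . "\n";
}
```

The same helper can be reused for the other forms by passing a different list of required names.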

Now it's time to log in.

  $result = curl_exec($crl);
  // curl_getinfo() returns details about the transfer, including the final URL
  $headers = curl_getinfo($crl);
  if ($headers['url'] == $url) {
      die("Cannot log in.");
  }

What we're doing here is sending the form data to the Drupal login page. We then ask cURL for information about the transfer and check whether the final URL is still the login one. If it is, we couldn't log in. This works because, after a successful login, Drupal redirects to http://www.example.com/user/2 (2 being the id of the logged-in user), and CURLOPT_FOLLOWLOCATION makes cURL follow that redirect. If the login fails, Drupal returns to the login page and displays an error.
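The URL comparison is repeated for every form in this tutorial, so it can be pulled into a small function. This is a sketch, not something from my original script: was_redirected is a made-up name, and ignoring a trailing slash is my own assumption about how the two URLs might differ.

```php
<?php
// Hypothetical helper: true when the final URL differs from the URL we
// posted to, meaning Drupal redirected us away from the form.
function was_redirected($postedUrl, $effectiveUrl)
{
    // Ignore a trailing slash so ".../user/login/" and ".../user/login" match
    return rtrim($postedUrl, "/") !== rtrim($effectiveUrl, "/");
}

// With a live handle you would pass the final URL reported by cURL:
//   was_redirected($url, curl_getinfo($crl, CURLINFO_EFFECTIVE_URL))
$loggedIn = was_redirected(
    "http://www.example.com/user/login",
    "http://www.example.com/user/2"
);
```

Keep in mind this only tells you that a redirect happened, not where it went, so it is a rough success signal at best.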

If everything went ok, we're logged in. Now you can do more serious stuff. :) Let's continue with some examples:

2. Example: Adding an image

  $file = "test.jpg";
  $url = "http://www.example.com/node/add/image";
  curl_setopt($crl, CURLOPT_URL, $url);
  $postdata = array(
      "edit[title]" => $file,
      // the leading @ makes cURL upload the file's contents
      "edit[image]" => "@$file",
      "edit[body]" => "This is an image posted with cURL.",
      "op" => "Submit",
      "edit[format]" => "1",
      "edit[comment]" => "2",
      "edit[name]" => "admin",
      "edit[date]" => "",
      "edit[status]" => "1",
      "edit[moderate]" => "0",
      "edit[promote]" => "1",
      "edit[sticky]" => "0",
      "edit[revision]" => "0",
      "edit[images][_original]" => "",
      "edit[images][thumbnail]" => "",
      "edit[images][preview]" => "",
      "edit[form_id]" => "image_node_form"
  );

  curl_setopt($crl, CURLOPT_POSTFIELDS, $postdata);

  $result = curl_exec($crl);
  $headers = curl_getinfo($crl);
  // still on the add page means Drupal rejected the form
  if ($headers['url'] == $url) {
      die("Cannot add the image.");
  }

I looked over the image add form generated by Drupal, noting the field names and their default values. Then we fill these in with our info and send the data. Because we're using the same resource, $crl, we're still logged in. In the end we check if the resulting URL is still "node/add/image". If it is, then some data wasn't filled in right and Drupal returned us to the node add page.

What's with that @ in the code?

  "edit[image]" => "@$file",

edit[image] is an input of type file. By using @, we tell cURL to read the contents of that file and send them to the add script. Note that $file already contains the extension, so nothing should be appended to it.
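One caveat for newer PHP versions: the @ prefix was deprecated in PHP 5.5 in favor of the CURLFile class and no longer works in PHP 7. The sketch below picks whichever form is available. upload_field is a hypothetical helper name, and the array is trimmed to two fields just to show the idea.

```php
<?php
// Hypothetical helper: build the value for a file-upload field in a way
// that works on both old and new PHP versions.
function upload_field($path)
{
    if (class_exists('CURLFile')) {
        // PHP 5.5 and later: pass an object instead of an "@" string
        return new CURLFile($path);
    }
    // Older PHP: fall back to the "@" prefix
    return "@" . $path;
}

$file = "test.jpg";
// Only two of the form fields are shown here, for brevity
$postdata = array(
    "edit[title]" => $file,
    "edit[image]" => upload_field($file),
);
```

The rest of the request stays the same: the array still goes to cURL through CURLOPT_POSTFIELDS.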

3. Example: Adding a user

  $url = "http://www.example.com/admin/user/create";
  curl_setopt($crl, CURLOPT_URL, $url);
  $postdata = array(
      "edit[name]" => "newuser",
      "edit[mail]" => "newuser@example.com",
      "edit[password]" => "password",
      "edit[notify]" => 1,
      "op" => "Submit",
      "edit[form_id]" => "user_register"
  );

  curl_setopt($crl, CURLOPT_POSTFIELDS, $postdata);

  $result = curl_exec($crl);
  $headers = curl_getinfo($crl);
  if ($headers['url'] == $url) {
      die("Cannot add the user.");
  }

And that's basically it. Post any questions or suggestions here.

Comments

Doug Gough (not verified) - Sat, 2006-12-09 22:29

Is it possible to send the taxonomy terms by curl as well? For instance, I have a category named "department" containing the terms "high", "middle", "primary". Could I submit a story node and make sure it's added to the "primary" category?

thanks,

duggoff

Pablo (not verified) - Sat, 2007-01-27 19:57

I've been trying to update my Drupal site using the (Windows) command-line version of curl but I'm not able to log in (curl return code: 0). Do you know of any differences between the PHP libcurl library and curl.exe in terms of syntax, etc.?

many thanks and best regards,

Pablo

Anonymous (not verified) - Mon, 2007-02-26 23:41

So, do you have a full example? with login and adduser?

Anonymous (not verified) - Mon, 2007-02-26 23:49

Do you have a 5.1 example?
Also, do you have a full example, with login and user addition?

andu - Tue, 2007-02-27 09:53

Pablo, you should make sure that curl is configured to save and load cookie data from a writable directory.

Anonymous, didn't try on 5.1. Let me check if what I have breaks under 5. I'll get back to you.

Anonymous (not verified) - Tue, 2007-02-27 13:18

Thank you. Please let us know how you tackle 5.1.
Also, are you able to retrieve nodes by using curl too? Also, do you have 2 separate scripts (login.php, add-user.php)? Do you run login.php first, and then add-user.php next?

OutThere (not verified) - Sat, 2007-03-24 17:50

I am authenticating with a ruby script.
Nevermind. The concepts should be the same.

I am not sure what happens in 5.1, but I get 2 different cookies from the 5.1 server after posting my credentials. Now, the second cookie gets a corresponding session, which, I assume, means the script has authenticated. However, even though the script authenticates and gets a 302 redirect from the server, the corresponding session has UID 0. That means the cookie is useless to do anything with, as the user owning the session is anonymous and has no rights.

I do not know if this is specific to Ruby or if it is a Drupal thing. I guess I will have to hack Drupal to see what happens... Maybe there is a minimal set of headers in the POST that has to be met in order for Drupal to actually log you in. I sniffed around and made a call IDENTICAL to Firefox 1.5 on Linux. To no avail.

Take care

Joe (not verified) - Wed, 2007-04-25 14:42

I would love to see a working example in 5.1 in curl at least, but also using (MSXML2.)XMLHTTP. Any luck with this yet? I haven't had any yet, and I'm no expert. Thanks!

andu - Wed, 2007-04-25 15:32

The curl method doesn't work in 5.1 due to the fact that each form expects a token and I can't generate that. I might make another attempt at it but I don't think it's possible.

However, in Drupal 5 you can submit forms programmatically. There's some slim documentation on Drupal.org about this subject. I used it for a project so I might make a tutorial about it. The downside is that your script needs to have access to the Drupal sources (i.e. no more updating from scripts on another domain).

OutThere, there was an issue with two cookies being sent in Drupal 4.7.5 in which case you couldn't log in anymore. It should have been fixed in Drupal 4.7.6 and 5. Maybe you're still experiencing this bug? Search for "drupal double login" for relevant information.

Joe, I don't get what you mean by example with XMLHTTP. Can you be more specific?

Joe (not verified) - Sat, 2007-04-28 14:58

(Lost my post - had to retype -- argggh!)

http://drupal.org/node/80548 (Edit a Drupal page from another PHP site using cURL) shows an example that I've gotten to work of generating a token and creating a new node. This is a start, though I haven't gotten into it too deeply yet. If anyone wants to help flesh out that example, or explain it more, that would be great.

XMLHTTP is IE's object for AJAX, like XMLHttpRequest in other browsers. It's also available to any code running in Windows, and I'm looking to add content from inside VBA in an MS Office app using XMLHTTP instead of curl. XMLHTTP object has methods such as .open, .setrequestheader, .send, etc. So I need to convert the PHP/curl code to VBA/XMLHTTP code. That's what I'm looking to do...

Thanks for your replies!

Joe (not verified) - Sat, 2007-04-28 14:52

http://drupal.org/node/80548 (Edit a Drupal page from another PHP site using cURL) shows curl code to generate a token and I've been able to use this code to successfully add a node to my site. I think this is good news. But I haven't delved into it more deeply yet.

What I meant by XMLHTTP -- that's IE's AJAX object (like XMLHttpRequest), and I'd like to use it instead of curl because I'm scripting from MS VBA in an MS Office application. XMLHTTP object has .open, .setrequestheader, .send methods for example. So if I can figure out how to convert the above-referenced curl code to use the XMLHTTP object instead, I'd be happy. Thanks for your responses!

Joe (not verified) - Mon, 2007-05-14 13:21

I now have the ability to post any page to Drupal 5.1 from VB using MSXML2.XMLHTTP object. The two main issues were:

1. Drupal 5.1 tokens -- you must first request the page, grab the token and then use it as a POST parameter on a second call -- see http://drupal.org/node/80548 (Edit a Drupal page from another PHP site using cURL)
2. A documented Microsoft cookie bug - you must set the cookie twice in order for it to take effect -- see http://support.microsoft.com/kb/290899 (BUG: XMLHTTP Fails to Send Cookies from a Client).
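
For PHP readers, the first of those two steps (request the page, grab the token) might look roughly like the sketch below. It only covers the parsing part. extract_form_token is a hypothetical helper, and the sample markup is invented for the example, on the assumption that the token sits in a hidden input named form_token.

```php
<?php
// Hypothetical helper: pull the value of the hidden form_token input out of
// a form page's HTML. Returns null when no such input is found.
function extract_form_token($html)
{
    $doc = new DOMDocument();
    // Real pages are rarely valid markup, so keep parser warnings quiet
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//input[@name="form_token"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->getAttribute('value');
}

// Sample markup made up for this example
$html = '<form><input type="hidden" name="form_token" value="abc123" /></form>';
$token = extract_form_token($html);
```

In a real script $html would be the result of a GET request for the form page, and $token would then be sent back as the form_token POST field on the second call.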

jat32 (not verified) - Mon, 2007-07-16 19:59

I have reviewed the tutorials and snippets on the Drupal site (http://drupal.org/node/87711) and I have read several discussions about the topic. Yet I'm still having trouble.

After using PHP/Curl to login, I move to the node/add/image page and get the token no problem. But I'm having trouble posting a new image node. I can get everything working when posting a new page node with Curl. It's just posting a new image node that's giving me trouble.

Here is the code that should add the new image node. Curl fills out most of the image form fields such as title, body etc. All fields are populated by Curl EXCEPT the image file upload. And this seems to be what's stopping curl from creating the new image node. What am I doing wrong? How can I get Curl to fill in the file upload field?

  $ch = curl_init();
  curl_setopt($ch, CURLOPT_HEADER, 1);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
  curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
  curl_setopt($ch, CURLOPT_URL,"http://www.example.com/?q=node/add/image/");
  curl_setopt($ch, CURLOPT_POST, 1);

  $postarray = array(
      'files[image]' =>@$file,
      'taxonomy[3]'=>'2',
      'title' => $file,
      'body' => 'xxxx',
      'status' => '1',
      'revision' => '1',
      'comment' => '2',
      'name' => 'mov',
      'moderate' => '0',
      'form_id' => 'image_node_form',
      'form_token' => $token,
      'op' => 'Submit'
  );

  curl_setopt($ch, CURLOPT_POSTFIELDS, $postarray);

  $buf2 = curl_exec ($ch);

  curl_close ($ch);

andu - Tue, 2007-07-17 06:35

Try 'files[image]' =>"@$file" instead of 'files[image]' =>@$file. Notice the quotes.

Vincent's Tips and Tricks (not verified) - Fri, 2008-01-25 00:43

Nice article, and interesting reading, unfortunately a little out of my league.

Regards

Vincent

The World of Office, XP and Vista Tips and Tricks.

dinesh (not verified) - Tue, 2008-03-04 06:21

test cur

Jim (not verified) - Thu, 2008-03-27 14:56

$file = "test.jpg";
...
"edit[image]"=>"@$file.jpg",

I'm not really familiar with arrays. Doesn't that result in "test.jpg.jpg"?

Arald Jean-Charles (not verified) - Fri, 2008-09-12 05:11

Hi,
I like the solution proposed above. What would be the equivalent AS script to upload the image, especially the following code:

"edit[image]"=>"@$file.jpg"

If I were to write an AS script of your code, how would I translate that into AS (version 2 or 3)?
