To strip JavaScript from an HTML string in Python, you can use a regular expression (the re module) to remove every <script> element together with its contents. Here's a simple function that does this:

import re
def exclude_javascript(html_code):
    # Remove <script> tags and their contents
    cleaned_html = re.sub(r'<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>', '', html_code, flags=re.IGNORECASE)
    return cleaned_html

# Example usage
html_code_with_js = """
<html>
<head>
<title>Sample HTML with JavaScript</title>
</head>
<body>
<h1>Hello, world!</h1>
<script>alert('This is JavaScript!');</script>
<p>This is a paragraph.</p>
<script>
console.log('This is also JavaScript!');
</script>
</body>
</html>
"""
html_code_without_js = exclude_javascript(html_code_with_js)
print(html_code_without_js)
The exclude_javascript() function takes an HTML string and returns it with all <script> elements and their contents removed. Test it on HTML that is representative of your use case before relying on it. Regular expressions are a poor fit for parsing HTML in general: this pattern misses closing tags written with whitespace, such as </script >, and it does not touch JavaScript that lives outside <script> elements, such as inline event handlers (onclick="...") or javascript: URLs. For more robust parsing and manipulation, consider a dedicated HTML parsing library such as BeautifulSoup.
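
As a rough illustration of the parser-based approach, here is a minimal sketch that uses the standard library's html.parser module instead of BeautifulSoup, so it runs without extra installs. The names ScriptStripper and exclude_javascript_parsed are hypothetical and chosen only for this example. The sketch re-emits the markup as it is parsed and skips <script> elements; like the regex version, it leaves inline event handlers and javascript: URLs in place.

```python
from html.parser import HTMLParser


class ScriptStripper(HTMLParser):
    """Re-emit HTML as it is parsed, skipping <script> elements (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; as written
        super().__init__(convert_charrefs=False)
        self.parts = []
        self.in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
        elif not self.in_script:
            # get_starttag_text() returns the start tag exactly as it appeared
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>
        if tag != "script" and not self.in_script:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
        elif not self.in_script:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_script:
            self.parts.append(data)

    def handle_entityref(self, name):
        if not self.in_script:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if not self.in_script:
            self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        self.parts.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.parts.append(f"<!{decl}>")


def exclude_javascript_parsed(html_code):
    """Return html_code with <script> elements removed, using a real parser."""
    stripper = ScriptStripper()
    stripper.feed(html_code)
    stripper.close()
    return "".join(stripper.parts)


if __name__ == "__main__":
    sample = "<p>a &amp; b</p><SCRIPT>alert(1 < 2);</SCRIPT><br/><p>c</p>"
    print(exclude_javascript_parsed(sample))
```

Because the parser tracks where a script element starts and ends, a "<" inside the script body does not confuse it. Note that end tags are rebuilt in lowercase, so the output is not guaranteed to be byte-for-byte identical to the input outside the removed scripts.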